Reading ELF header of loaded shared object during runtime - linux

I wrote some code to search for a symbol in a shared library's ELF header. The code works if I parse the shared object file stored on my disk.
Now, I wanted to use this code to parse the ELF header of a loaded shared library. As an example the libdl library is mapped into the current process:
b7735000-b7738000 r-xp 00000000 08:01 315560 /lib/i386-linux-gnu/libdl.so.2
b7738000-b7739000 r--p 00002000 08:01 315560 /lib/i386-linux-gnu/libdl.so.2
b7739000-b773a000 rw-p 00003000 08:01 315560 /lib/i386-linux-gnu/libdl.so.2
The first mapping contains the ELF header. I tried to read this header and extract the dlopen symbol from the .dynsym section. However, the header is slightly different from the one in the 'plain' .so file on disk. For example, the offset of the .shstrtab section is 0, so it is not possible to get the name of a section.
I wanted to ask why the ELF header is changed during loading of the library and where I can find the 'missing' sections. Is it even possible to parse the ELF header after the library was loaded?
Does anybody know any article explaining the layout of a shared library/its ELF header when it is mapped into a process?
Currently I'm using the following functions to iterate over the ELF header. If libdl_start points to the memory-mapped libdl.so.2 file, the code works fine. However, if it points to the region mapped by the dynamic linker, get_dynstr_section does not find the .dynstr section.
int get_libdl_functions()
{
    Elf32_Ehdr *ehdr = libdl_start;
    Elf32_Shdr *shdr, *shdrs_start = (Elf32_Shdr *)(((char *)ehdr) + ehdr->e_shoff);
    Elf32_Sym *symbol, *symbols_start;
    char *strtab = get_dynstr_section();
    int sec_it = 0, sym_it = 0;
    rt_info->dlopen = NULL;
    rt_info->dlsym = NULL;
    if(strtab == NULL)
        return -1;
    for(sec_it = 0; sec_it < ehdr->e_shnum; ++sec_it) {
        // Iterate over all sections to find .dynsym
        shdr = shdrs_start + sec_it;
        if(shdr->sh_type == SHT_DYNSYM)
        {
            // Ok we found the right section
            symbols_start = (Elf32_Sym *)(((char *)ehdr) + shdr->sh_offset);
            for(sym_it = 0; sym_it < shdr->sh_size / sizeof(Elf32_Sym); ++sym_it) {
                symbol = symbols_start + sym_it;
                if(ELF32_ST_TYPE(symbol->st_info) != STT_FUNC)
                    continue;
                // strncmp returns 0 when the names match
                if(strncmp(strtab + symbol->st_name, DL_OPEN_NAME, sizeof DL_OPEN_NAME) == 0 && !rt_info->dlopen) {
                    //printf("Offset of dlopen: 0x%x\n", symbol->st_value);
                    rt_info->dlopen = ((char *)ehdr) + symbol->st_value;
                } else if(strncmp(strtab + symbol->st_name, DL_SYM_NAME, sizeof DL_SYM_NAME) == 0 && !rt_info->dlsym) {
                    //printf("Offset of dlsym: 0x%x\n", symbol->st_value);
                    rt_info->dlsym = ((char *)ehdr) + symbol->st_value;
                }
                if(rt_info->dlopen != 0 && rt_info->dlsym != 0)
                    return 0;
            }
        }
    }
    return -1;
}
void *get_dynstr_section()
{
    Elf32_Ehdr *ehdr = libdl_start;
    Elf32_Shdr *shdr, *shdrs_start = (Elf32_Shdr *)(((char *)ehdr) + ehdr->e_shoff);
    // Section header string table: holds the section names
    char *strtab = ((char *)ehdr) + ((shdrs_start + ehdr->e_shstrndx))->sh_offset;
    int sec_it = 0;
    for(sec_it = 0; sec_it < ehdr->e_shnum; ++sec_it) {
        // Iterate over all sections to find .dynstr section
        shdr = shdrs_start + sec_it;
        // strncmp returns 0 when the names match
        if(shdr->sh_type == SHT_STRTAB && strncmp(strtab + shdr->sh_name, DYNSTR_NAME, sizeof DYNSTR_NAME) == 0)
            return ((char *)ehdr) + shdr->sh_offset;
    }
    return NULL;
}

You do NOT need to mmap the shared library again (the system already did it), but you cannot rely on the section headers. The section headers are only for the linking view of an ELF file and often aren't allocated into a program segment. You will need to look at it from the execution view.
The section .dynstr is always loaded into memory; otherwise dynamic linking wouldn't work. To get at it, go through the program headers to find the PT_DYNAMIC segment. It will have elements DT_SYMTAB and DT_STRTAB that correspond to .dynsym and .dynstr.
You may also have to adjust the address values using a base address. It's very common, especially with ASLR, for shared objects to be mapped at different virtual addresses than they were linked at. You can find this adjustment amount by subtracting the lowest virtual address in a PT_LOAD entry from the lowest mapped segment in the memory map.
Or even better, use the link map maintained by ld.so. It contains the base address, the path of the shared object, and a pointer to the shared object's dynamic area. Consult for how this is laid out.
If you are running Linux, you might be very interested in the function dl_iterate_phdr(). It's great for finding things about the libraries mapped into the current process image. If you want to examine another process, you have to roll your own.
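The execution-view lookup described above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: struct dyn_tables, absolute_addr, find_tables_cb and find_dynamic_tables are hypothetical names, and the handling of d_ptr is an assumption (depending on the platform and the object, the loader may or may not already have added the base address to it, so values below the load base are treated as unrelocated).

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result record, not part of any library API. */
struct dyn_tables {
    const char *name_substr;   /* substring of the object path to match */
    const ElfW(Sym) *symtab;   /* DT_SYMTAB, corresponds to .dynsym */
    const char *strtab;        /* DT_STRTAB, corresponds to .dynstr */
};

/* Assumption: a d_ptr below the load base has not been relocated yet. */
static ElfW(Addr) absolute_addr(ElfW(Addr) base, ElfW(Addr) ptr)
{
    return ptr < base ? base + ptr : ptr;
}

/* Called by dl_iterate_phdr() once per loaded object. */
static int find_tables_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct dyn_tables *out = data;
    (void)size;

    if (strstr(info->dlpi_name, out->name_substr) == NULL)
        return 0;   /* not the object we want, keep iterating */

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_DYNAMIC)
            continue;
        /* p_vaddr is a link-time address; dlpi_addr is the load base. */
        const ElfW(Dyn) *dyn =
            (const ElfW(Dyn) *)(info->dlpi_addr + ph->p_vaddr);
        for (; dyn->d_tag != DT_NULL; dyn++) {
            if (dyn->d_tag == DT_SYMTAB)
                out->symtab = (const ElfW(Sym) *)
                    absolute_addr(info->dlpi_addr, dyn->d_un.d_ptr);
            else if (dyn->d_tag == DT_STRTAB)
                out->strtab = (const char *)
                    absolute_addr(info->dlpi_addr, dyn->d_un.d_ptr);
        }
        return 1;   /* stop iterating: this was the matching object */
    }
    return 0;
}

/* Returns 0 and fills *out if both tables were found, -1 otherwise. */
int find_dynamic_tables(const char *name_substr, struct dyn_tables *out)
{
    out->name_substr = name_substr;
    out->symtab = NULL;
    out->strtab = NULL;
    dl_iterate_phdr(find_tables_cb, out);
    return (out->symtab != NULL && out->strtab != NULL) ? 0 : -1;
}
```

Calling find_dynamic_tables("libdl", &tables) would look for an object whose path contains "libdl"; an empty string matches the first object reported, which is normally the main program. Note that the dynamic section does not record the number of symbols directly, so walking the whole symbol table additionally requires DT_HASH or DT_GNU_HASH.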

why the ELF header is changed during loading of the library
It isn't. Your question is based on a false assumption, but since you didn't show any actual code, it's hard to guess what you've done wrong.
Update:
In this code:
*shdrs_start = (Elf32_Shdr *)(((char *)ehdr) + ehdr->e_shoff);
you assume that section headers are loaded into memory. But section headers are not required at runtime, and if they end up loaded into memory, it's only by accident.
You need to read them into memory from disk (or mmap them) yourself, using the e_shoff you got from ehdr.

Related

Why don't multiple threads have to share a lock to call mmap like they do malloc/calloc/sbrk?

I'm working with ptmalloc, and something interesting I came across is when an arena runs out of available chunks (and the top chunk is not large enough) and has to either extend the arena using sbrk() or allocate a non-contiguous region using mmap(). What particularly stood out to me is that in order to allocate more memory using sbrk(), a lock had to be acquired before being able to call it (in addition to the lock previously obtained to be in sole possession of the current arena). However, no lock needs to be acquired before calling mmap(). I have included the specific parts of the sys_alloc() function from the malloc.c file included in the ptmalloc implementation (for reference) below:
Call to extend arena using sbrk():
if (HAVE_MORECORE && tbase == CMFAIL) { /* Try noncontiguous MORECORE */
    size_t asize = granularity_align(nb + TOP_FOOT_SIZE + SIZE_T_ONE);
    if (asize < HALF_MAX_SIZE_T) {
        char* br = CMFAIL;
        char* end = CMFAIL;
        ACQUIRE_MORECORE_LOCK(); /* LOCK */
        br = (char*)(CALL_MORECORE(asize));
        end = (char*)(CALL_MORECORE(0));
        RELEASE_MORECORE_LOCK(); /* UNLOCK */
        if (br != CMFAIL && end != CMFAIL && br < end) {
            size_t ssize = end - br;
            if (ssize > nb + TOP_FOOT_SIZE) {
                tbase = br;
                tsize = ssize;
            }
        }
    }
}
Call to extend arena using mmap():
if (HAVE_MMAP && tbase == CMFAIL) { /* Try MMAP */
    size_t req = nb + TOP_FOOT_SIZE + SIZE_T_ONE;
    size_t rsize = granularity_align(req);
    if (rsize > nb) { /* Fail if wraps around zero */
        char* mp = (char*)(CALL_MMAP(rsize));
        if (mp != CMFAIL) {
            tbase = mp;
            tsize = rsize;
            mmap_flag = IS_MMAPPED_BIT;
        }
    }
}
Any help understanding why this is able to work even with multiple threads that have the exact same memory pattern (and thus have to extend their arenas at the same time) without having to use locks (i.e., how mmap() is guaranteed to return distinct addresses, even if called simultaneously with a NULL suggested address) would be greatly appreciated.
In the code snippet using sbrk(), the call is used to grow the process-wide heap area. Two calls are issued: the first extends the heap by asize bytes and the second returns the resulting address of the new top of the heap (the so-called program break). The heap area is shared by all the threads of the process, and its current top is a single global value for all of them. The two calls must therefore run as one unit, which is why they are protected by a mutex whenever a thread modifies the break (shrink/grow operations).
In the code snippet using mmap(), the current thread allocates a single memory-mapped area for itself. The resulting address is used only by the calling thread. Hence, no mutual exclusion is necessary from the point of view of ptmalloc's global data structures, as they are not modified. The IS_MMAPPED_BIT flag is set in the internal allocation header so that ptmalloc knows this is a memory-mapped region when it is asked to free it. As for mmap() internals, mutual exclusion is handled inside the kernel.
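The point about mmap() can be checked with a small experiment. The sketch below is only an illustration under stated assumptions: concurrent_mmap_demo, map_one_region and the DEMO_* constants are hypothetical names, the thread count and mapping size are arbitrary, and the program is assumed to be built with -pthread. Several threads call mmap() with a NULL hint and no user-space lock, and the result is checked for overlapping regions.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define DEMO_THREADS 8
#define DEMO_MAP_SIZE (64 * 1024)

/* Each thread maps one anonymous region without any user-space lock. */
static void *map_one_region(void *arg)
{
    void **slot = arg;
    /* NULL hint: the kernel chooses the address and serializes
     * concurrent updates of the address space internally. */
    *slot = mmap(NULL, DEMO_MAP_SIZE, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return NULL;
}

/* Returns 0 if every thread got its own non-overlapping mapping. */
int concurrent_mmap_demo(void)
{
    pthread_t threads[DEMO_THREADS];
    void *regions[DEMO_THREADS];
    int started = 0, result = 0;

    for (int i = 0; i < DEMO_THREADS; i++)
        regions[i] = MAP_FAILED;

    for (; started < DEMO_THREADS; started++) {
        if (pthread_create(&threads[started], NULL,
                           map_one_region, &regions[started]) != 0) {
            result = -1;
            break;
        }
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    /* Every mapping must have succeeded... */
    for (int i = 0; i < DEMO_THREADS; i++)
        if (regions[i] == MAP_FAILED)
            result = -1;

    /* ...and no two mappings may overlap. */
    for (int i = 0; i < DEMO_THREADS && result == 0; i++) {
        for (int j = i + 1; j < DEMO_THREADS; j++) {
            uintptr_t a = (uintptr_t)regions[i];
            uintptr_t b = (uintptr_t)regions[j];
            if (a < b + DEMO_MAP_SIZE && b < a + DEMO_MAP_SIZE)
                result = -1;
        }
    }

    for (int i = 0; i < DEMO_THREADS; i++)
        if (regions[i] != MAP_FAILED)
            munmap(regions[i], DEMO_MAP_SIZE);
    return result;
}
```

A passing run does not prove the guarantee, of course; it only shows the behaviour the answer describes, with the kernel rather than the allocator doing the serialization.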

Linux Kernel code in memory check with sha256 sum

Is there a way of finding the loaded kernel code in memory? I mean, the bootloader loads the kernel and executes it. The kernel extracts itself, starts to initialize the hardware, and runs init. As I understand it, the kernel is stored in and loaded from the (b)zImage on disk. I want to find this unchanged code in the system's memory and check it.
My idea is the following:
Create a SHA-256 hash of the loaded kernel code and compare it to a defined value to audit the security of the system. To do that, I load a module which tries to find the kernel code in memory and computes the SHA-256 sum of it.
I tried to find the kernel code in memory like that:
static struct resource *adhoc_next_resource(struct resource *p, bool sibling_only)
{
    if (sibling_only)
        return p->sibling;
    if (p->child)
        return p->child;
    while (!p->sibling && p->parent)
        p = p->parent;
    return p->sibling;
}
static struct resource *get_kernel_code (void) {
    struct resource *kern_code = &iomem_resource;
    while (kern_code && strcmp(kern_code->name ? kern_code->name : "","Kernel code") != 0) {
        kern_code = adhoc_next_resource(kern_code, false);
    }
    return kern_code;
}
int init_module(void)
{
    void *start,*end;
    size_t length;
    SHA256_CTX sha256;
    u32 *hash;
    struct resource *kern_code;
    kern_code = get_kernel_code();
    /* get_kernel_code() returns NULL, not an ERR_PTR, when nothing matches */
    if ( !kern_code )
        return -EINVAL;
    start = (void*)phys_to_virt(kern_code->start);
    end = (void*)phys_to_virt(kern_code->end);
    length = kern_code->end - kern_code->start;
    printk("%s[%s]:%s address: %0*llx-%0*llx \n", MODULE_NAME, __FUNCTION__, kern_code->name ? kern_code->name : "", 4, start, 4, end );
    printk("%s[%s]: length: %lu \n", MODULE_NAME, __FUNCTION__, length);
    printk ( KERN_INFO "%s[%s]: Init sha256\n", MODULE_NAME, __FUNCTION__ );
    sha256_init(&sha256);
    printk ( KERN_INFO "%s[%s]: Give kernel code to sha256\n", MODULE_NAME, __FUNCTION__ );
    sha256_update ( &sha256, start, length );
    /* A SHA-256 digest is 32 bytes, i.e. 8 u32 words */
    hash = kmalloc ( 8 * sizeof(u32), GFP_KERNEL );
    printk ( KERN_INFO "%s[%s]: Finalize sha256\n", MODULE_NAME, __FUNCTION__ );
    sha256_final ( &sha256, (BYTE*)hash );
    printk ( KERN_INFO "%s[%s]: Hash value of kernel code: %x - %x - %x - %x \n", MODULE_NAME, __FUNCTION__, hash[0], hash[1], hash[2], hash[3] );
    kfree(hash);
    return 0;
}
But I get a different sha256 sum after every reboot.
Can someone explain what happens? Something in the memory of the kernel code changed, but what can it be?
Can this concept work at all? Or is the loaded code in memory not the same every time?
I found an answer to this problem myself:
I did some more research about the memory location of the kernel. I found that at runtime the kernel ".text" section (from the extracted bzImage) is not stored in the iomem_resource list. I inspected the address range of the "Kernel code" resource (see the code I posted above) and found that most of the bits are 0. Only at the very beginning and at the end is there some code. But this is not kernel code, and I didn't find out what this "Kernel code" resource actually is. Maybe it is supposed to be the ".text" section of the kernel image, but it is not on my system (ARM 32-bit, Yocto Linux 4.10.17).
The KASLR feature (info here) mentioned by Yasushi is not the problem on the 32-bit ARM architecture. I found in the kernel defines that this feature is not enabled on this type of system. But it is a very interesting feature.
To get the kernel address of the ".text" section I now use the kallsyms functionality. This is a big list of all addresses and exported variables. There are two symbols that define the start and the end. I use the following code:
start = (void*)kallsyms_lookup_name("_stext");
end = (void*)kallsyms_lookup_name("_etext");
Note: Be careful with virtual vs. physical memory locations. The kallsyms output is a virtual memory location. (Please correct me if I am wrong.)
Finally: I get the same checksum on each boot for the same kernel code, and I can go on finishing my audit module :-).

Why my implementation of sbrk system call does not work?

I am trying to write a very simple OS to better understand the basic principles, and I need to implement a user-space malloc. So first I want to implement and test it on my Linux machine.
At first I implemented the sbrk() function in the following way:
void* sbrk( int increment ) {
return ( void* )syscall(__NR_brk, increment );
}
But this code does not work. When I use the sbrk provided by the OS instead, it works fine.
I have tried to use another implementation of sbrk():
static void *sbrk(signed increment)
{
    size_t newbrk;
    static size_t oldbrk = 0;
    static size_t curbrk = 0;
    if (oldbrk == 0)
        curbrk = oldbrk = brk(0);
    if (increment == 0)
        return (void *) curbrk;
    newbrk = curbrk + increment;
    if (brk(newbrk) == curbrk)
        return (void *) -1;
    oldbrk = curbrk;
    curbrk = newbrk;
    return (void *) oldbrk;
}
sbrk is invoked from this function:
static Header *morecore(unsigned nu)
{
    char *cp;
    Header *up;
    if (nu < NALLOC)
        nu = NALLOC;
    cp = sbrk(nu * sizeof(Header));
    if (cp == (char *) -1)
        return NULL;
    up = (Header *) cp;
    up->s.size = nu; // ***Segmentation fault
    free((void *)(up + 1));
    return freep;
}
This code does not work either; on the line marked (***) I get a segmentation fault.
Where is the problem?
Thanks, all. I have solved my problem using a new implementation of sbrk. The code below works fine.
void* __sbrk__(intptr_t increment)
{
    void *new, *old = (void *)syscall(__NR_brk, 0);
    new = (void *)syscall(__NR_brk, ((uintptr_t)old) + increment);
    return (((uintptr_t)new) == (((uintptr_t)old) + increment)) ? old : (void *)-1;
}
The first sbrk should probably have a long increment. And you forgot to handle errors (and set errno).
The second sbrk function does not change the address space (as sbrk does). You could use mmap to change it (but using mmap instead of sbrk won't update the kernel's view of the data segment end as sbrk does). You could use cat /proc/1234/maps to query the address space of the process with pid 1234, or even read /proc/self/maps (e.g. with fopen and fgets) from inside your program.
BTW, sbrk is obsolete (most malloc implementations use mmap), and by definition every system call (listed in syscalls(2)) is executed by the kernel (for sbrk the kernel maintains the "data segment" limit!). So you cannot avoid the kernel, and I don't even understand why you want to emulate any system call. Almost by definition, you cannot emulate syscalls since they are the only way to interact with the kernel from a user application. From the user application, every syscall is an atomic elementary operation (done by a single SYSENTER machine instruction with appropriate contents in machine registers).
You could use strace(1) to understand the actual syscalls done by your running program.
BTW, the GNU libc is free software. You could look into its source code. musl-libc is a simpler libc and its code is more readable.
Finally, compile with gcc -Wall -Wextra -g and use the gdb debugger (you can even query the registers, if you want to). Perhaps read the x86/64-ABI specification and the Linux Assembly HowTo.
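A small sketch of the /proc/self/maps suggestion above, which can help explain the segmentation fault in morecore: if the address returned by a hand-written sbrk does not fall inside any mapping, the store to up->s.size will fault. address_is_mapped is a hypothetical helper name, and the approach is Linux-specific. Worth knowing when comparing the implementations: according to the brk(2) manual page, glibc's brk() wrapper returns 0 or -1, while the raw Linux system call returns the new program break.

```c
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if addr lies inside a mapping listed in /proc/self/maps,
 * 0 if it does not, and -1 if the file cannot be opened. */
int address_is_mapped(const void *addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[512];
    int found = 0;

    if (maps == NULL)
        return -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        /* Each entry starts with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &start, &end) != 2)
            continue;
        if ((uintptr_t)addr >= start && (uintptr_t)addr < end) {
            found = 1;
            break;
        }
    }
    fclose(maps);
    return found;
}
```

For example, checking address_is_mapped(cp) right after the sbrk call in morecore would show whether the returned pointer is usable before it is dereferenced.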

Will "clear_user()" for .bss clearing lead to a page fault during ELF loading in function load_elf_binary?

The discussion below applies to 32-bit ARM Linux.
When the kernel loads an ELF executable file, the function load_elf_binary is called.
I believe it is the following code snippet that clears the .bss section:
nbyte = ELF_PAGEOFFSET(elf_bss);
if (nbyte) {
    nbyte = ELF_MIN_ALIGN - nbyte;
    if (nbyte > elf_brk - elf_bss)
        nbyte = elf_brk - elf_bss;
    if (clear_user((void __user *)elf_bss +
                   load_bias, nbyte)) {
        /*
         * This bss-zeroing can fail if the ELF
         * file specifies odd protections. So
         * we don't check the return value
         */
    }
}
However, elf_bss is a user-land pointer. If this pointer is dereferenced, I believe a page fault would arise because no physical page frame has been committed yet for the virtual addresses in that range.
But it's rather confusing that a user-land page fault arises while kernel code is executing.
Is my interpretation right?

getting to an ELF file information

OK...
So I'm supposed to write a program that prints all of the section names in an ELF file using only mmap (that's not important...).
What I did so far is this:
Mapped the file into memory, using the size from the stat structure:
if ((map_start = mmap(0, fd_stat.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0)) == MAP_FAILED)
Cast it to the right format from the starting point I got:
header = (Elf32_Ehdr *) map_start;
Got the section header offset from the file:
secoff = header->e_shoff;
Now, I know I need to go to the map_start+secoff location; that will give me the section table, and sh_name will give me an index into the string table...
How do I get to the string table?
How is it represented?
How do I use it? And is the value in sh_name an index into the string table (if it is represented as an array), or an offset?
Anyway, let's say I want to print the first two sections' names: how do I do it given the code I wrote above?
Help please?
header = (Elf32_Ehdr *) map_start;
secoff = header->e_shoff;
This is probably wrong. Unless the Elf32_Ehdr structure is explicitly declared __attribute__((packed)), the compiler may insert padding between the members of the structure, in which case sizeof(Elf32_Ehdr) != (the actual size of an ELF header). Why not simply use the libelf accessor functions instead?
Update: if you're not allowed to use accessor functions, you'll have to do something like this:
Elf32_Ehdr hdr;
memcpy(&hdr.e_ident, (char *)map_start + 0, EI_NIDENT);
memcpy(&hdr.e_type, (char *)map_start + EI_NIDENT, sizeof(Elf32_Half));
et cetera.
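To the original question of how sh_name is used: it is a byte offset into the section header string table, and that table is itself the section whose index is stored in e_shstrndx. Below is a minimal sketch, assuming the structures in <elf.h> match the on-disk layout on the build platform (their members are naturally aligned, so no padding is expected there) and that map_start points to a complete, well-formed 32-bit ELF file. section_name and print_section_names are hypothetical helper names.

```c
#include <elf.h>
#include <stddef.h>
#include <stdio.h>

/* Returns the name of section `index`, or NULL if the index is out of
 * range or the file has no section header string table. */
const char *section_name(const void *map_start, size_t index)
{
    const unsigned char *base = map_start;
    const Elf32_Ehdr *header = (const Elf32_Ehdr *)base;
    const Elf32_Shdr *sections =
        (const Elf32_Shdr *)(base + header->e_shoff);

    if (index >= header->e_shnum || header->e_shstrndx == SHN_UNDEF)
        return NULL;

    /* The string table is an ordinary section: find it via e_shstrndx. */
    const Elf32_Shdr *strsec = &sections[header->e_shstrndx];
    const char *strtab = (const char *)(base + strsec->sh_offset);

    /* sh_name is a byte offset into that table, not an array index. */
    return strtab + sections[index].sh_name;
}

/* Prints one line per section header. */
void print_section_names(const void *map_start)
{
    const Elf32_Ehdr *header = map_start;
    for (size_t i = 0; i < header->e_shnum; i++) {
        const char *name = section_name(map_start, i);
        printf("[%2zu] %s\n", i, name != NULL ? name : "(none)");
    }
}
```

With the question's variables, print_section_names(map_start) would list every section; calling section_name(map_start, 0) and section_name(map_start, 1) gives the first two names (the first one is normally the empty string, since section 0 is the reserved null section).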

Resources