Locating and Editing Dynamic Symbol Table of Loaded Program? - linux

My goal is explained in this question HERE
Is it possible to locate the address of a symbol's entry in the dynamic symbol table loaded into a program?
If we can locate it, can we edit it somehow? For example, if the app calls a function named original_func, control should instead reach my hook_func, and from there I call original_func.
Update:
Some code according to the answer by 'Employed Russian':
extern Elf32_Dyn _DYNAMIC[];
int i=0;
uint32_t DST_base_addr;
Elf32_Dyn *dyn;
for (dyn = _DYNAMIC; dyn->d_tag != DT_NULL; ++dyn)
{
    if(dyn->d_tag==DT_SYMTAB)
    {
        DST_base_addr=dyn->d_un.d_ptr;
        LOGE("Base address of dynamic symbol table is; 0x%x", DST_base_addr);
        break;
    }
}
Output: 0x148
1- Not sure what that 0x148 means. It's definitely not an absolute address.
2- Also, where can I find a good listing of useful predefined variables such as _DYNAMIC[], _GLOBAL_OFFSET_TABLE_, etc.? I wasn't aware of such variables even after going through ELF notes here and there.

Is it possible to locate the address of a symbol's entry in the dynamic symbol table loaded into a program?
Yes, it's pretty easy: iterate over elements of the _DYNAMIC[] array, until you find an element with .d_tag == DT_SYMTAB. The .d_un.d_ptr of that entry will point to the dynamic symbol table in memory.
To find a specific symbol, you will also need to refer to DT_STRTAB.
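The walk described above can be sketched as a small helper that scans a dynamic array until DT_NULL and returns the entry with a given tag. This is only a sketch: find_dyn_entry and dyn_value_to_addr are hypothetical names, and the rebasing rule is an assumption. Some loaders (glibc on x86 is one, as far as I know) rewrite d_ptr in place to an absolute address, while others leave it as an offset from the load base, which would be consistent with a small value such as 0x148.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the first entry carrying `tag`,
 * or NULL if the DT_NULL terminator is reached first. */
const ElfW(Dyn) *find_dyn_entry(const ElfW(Dyn) *dyn, int64_t tag)
{
    assert(dyn != NULL);
    for (; dyn->d_tag != DT_NULL; ++dyn) {
        if ((int64_t)dyn->d_tag == tag)
            return dyn;
    }
    return NULL;
}

/* Heuristic (an assumption, not a documented rule): a value below the
 * load base is treated as an unrelocated offset and rebased; anything
 * else is taken to be an absolute address already. */
uintptr_t dyn_value_to_addr(uintptr_t value, uintptr_t load_base)
{
    return value < load_base ? value + load_base : value;
}
```

With the real table, find_dyn_entry(_DYNAMIC, DT_SYMTAB) would return the entry whose d_un.d_ptr the question prints, and the same call with DT_STRTAB gives the string table needed to match symbol names.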
If we can locate it, can we edit it somehow?
Sure: it's just a memory location. You may need to mprotect it to be writable, but once you do, you can modify it to your heart's content.
However, most modifications will either have no effect, or cause your program to crash later.
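As a rough sketch of the mprotect step, assuming a Linux/POSIX system: mprotect needs a page-aligned start address, so the helper below rounds the address down to its page before changing the protection. make_range_writable is a hypothetical name, and the sketch sets the pages to read/write only, which drops any execute permission they had.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: make every page overlapping [addr, addr + len)
 * readable and writable. Returns 0 on success, -1 on failure (errno set).
 * Note: PROT_EXEC is not preserved, which is fine for data such as a
 * symbol table but not for code pages. */
int make_range_writable(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    if (len == 0)
        return 0;

    /* Round the start down to a page boundary. */
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page - 1);
    uintptr_t end = (uintptr_t)addr + len;   /* one past the last byte */

    /* The kernel extends the length to cover whole pages. */
    return mprotect((void *)start, end - start, PROT_READ | PROT_WRITE);
}
```

A caller would pass the address of the symbol table entry and sizeof(ElfW(Sym)) before writing to it.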
For example, if the app calls a function named original_func, control should instead reach my hook_func, and from there I call original_func.
It's pretty difficult to achieve your stated goal using this particular method, and much easier methods exist.
Perhaps you are looking for this?

Related

How can I correctly create a new variable in vvar.h for my new VDSO func?

I am trying to declare a new variable in vvar.h and define it near my new VDSO function, so that I can use this variable in my vdso function.
I have a question about VVar. According to the description in arch/x86/include/asm/vvar.h, when I declare a new variable here as DECLARE_VVAR(0, int, count), I should use DEFINE_VVAR(type, name) to define this variable somewhere else.
The problem is that after I define this variable somewhere else, e.g. DEFINE_VVAR(int, count), assigning an integer value to count fails. This is because after kernel version 5.2, DEFINE_VVAR changed from #define DEFINE_VVAR(type, name) type name to #define DEFINE_VVAR(type, name) type name[CS_BASES]. The variable count is now an integer array instead of an integer, so I can't assign an integer value to it. Do you know how to fix it?
VVAR.h: https://elixir.bootlin.com/linux/v5.12/source/arch/x86/include/asm/vvar.h#L43
Typically, you cannot add a variable through the DECLARE_VVAR macro alone.
First, be aware that .vvar is a page of memory (more specifically, mapped just before .vdso) that both the kernel and userland can access. You can see this in the linker script https://elixir.bootlin.com/linux/latest/source/arch/x86/entry/vdso/vdso-layout.lds.S. The kernel already has a data structure, struct vdso_data, that formats the data inside this page.
Second, assuming you want to add a variable to the .vvar page and access it in your new vdso function, the easiest way is to add it to the struct vdso_data structure in include/vdso/datapage.h: https://elixir.bootlin.com/linux/latest/source/include/vdso/datapage.h. After that, you can update it inside the kernel (for example, in schedule) in the same way as the other vvar variables.
Third, if you want your own vvar page, you have to define your own vvar data structure in datapage.h, and do not forget DEFINE_VVAR in vsyscall.h: https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/vdso/vsyscall.h. Also, since the vvar memory layout is compact, you need to allocate another page through the linker script https://elixir.bootlin.com/linux/latest/source/arch/x86/entry/vdso/vdso-layout.lds.S by changing vvar_start = . - 4 * PAGE_SIZE; into vvar_start = . - 5 * PAGE_SIZE;

I am playing with processes etc., but I don't know how to add "client.dll" to a hex value

In Cheat Engine you can write "client.dll"+00D3AC5C, and in ReClass <client.dll>+00D3AC5C.
How do I do the same in Python? I am using ReadWriteMemory, but I will soon replace it with something more complex. Can you please tell me how to do it with RWM or with something else?
According to the source code of that library, there's seemingly no way to get the base address of a process.
However, you can get the base address by bypassing the library and doing it yourself via this method. Once you have the hex value of the base address, simply add the offset to it, then use RWM's read() or get_pointer().

Get address of element of struct from map file or compiler output files

Using the map file I can find the address of any variable in my C software. I would also like to have access to the addresses of structure elements, not only the address of the struct.
Is there an easy way of getting this without needing to parse the whole code and look for the struct definitions or manually adding the offset to the struct variable? I can't seem to find anything helpful in the .map file alone but perhaps other compiler output files could have more information.

Get the end address of Linux kernel function on run-time

I am trying to get the boundaries of a kernel function (a system call, for example). If I understand correctly, I can get the start address of the function of interest by reading /proc/kallsyms or System.map, but I don't know how to get its end address.
As you may know, /proc/kallsyms lets us view the symbol table of the Linux kernel, so we can see the start address of all exported symbols. Can we use the start address of the next function to calculate the end address of the previous function? If not, could you suggest another way?
Generally, executables store only the start address of a function, as it is all that is required to call the function. You will have to infer the end address, rather than simply looking it up.
You could try to find the start address of the subsequent function, but that wouldn't always work either. Imagine the following:
void func_a() {
    // do something
}
static void helper_function() {
    // do something else
}
void func_b() {
    // ...
    helper_function();
    // ...
}
You could get the address of func_a and func_b, but helper_function would not show up, because nothing needs to link to it. If you tried to use func_b as the end of func_a (assuming that the order in the compiled code is equivalent to the order in the source code, which is not guaranteed), you would end up accidentally including code that you didn't need to include - and might not find code that you need to find when inlining other functions into func_b.
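For reference, the next-symbol heuristic discussed above can be sketched as follows. It assumes the start addresses have already been read (for example from /proc/kallsyms) and sorted in ascending order. next_symbol_after is a hypothetical name. For the reasons just given, the result is at best an upper bound on where the function ends, not its true end address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given symbol start addresses sorted in ascending
 * order, return the smallest address strictly greater than `start`,
 * or 0 if there is none. Duplicate addresses (aliases) are skipped
 * because the comparison is strict. */
uint64_t next_symbol_after(const uint64_t *sorted, size_t n, uint64_t start)
{
    size_t lo = 0, hi = n;

    /* Binary search for the first element greater than `start`. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] <= start)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < n ? sorted[lo] : 0;
}
```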
So, how do we find this information? Well, if you think about it - the information does exist - all of the paths within func_a will eventually terminate (in a loop, return statement, tail call, etc), probably before helper_function begins.
You would need to parse out the code of func_a and build up a map of all of the possible code paths within it. Of course, you would need to do this anyway to inline other functions into it - so it shouldn't be too much harder to simply not care about the end address of the function.
One final note: in this example, you would have trouble finding helper_function in order to know to inline it, because the symbol wouldn't show up in kallsyms. The solution here is that you can track the call instructions in individual functions to determine what hidden functions exist that you didn't know about otherwise.
TL;DR: You can only find the end address by parsing the compiled code. You have to parse this anyway, so just do it once.

Finding the load address of a shared library in Linux

At runtime I need to print out an address, and then find which function that address is part of. The functions are in a shared library so are not at a fixed address. My map file obviously just shows the relative offsets for each shared library func. Is it possible at runtime to query where a library has been loaded, so that I can subtract that value from my address to get the correct map file offset?
Currently I'm using a slightly hacky approach whereby I also print out the address of one function in the library, then find that function in the map file to figure out where the load address must be. I would rather have a generic method that didn't require you to name a reference function.
(GDB is not available in my setup).
Thanks.
On a recent Linux, you can use dl_iterate_phdr to find out the addresses of the shared libs.
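A minimal sketch of that approach, assuming glibc (or another libc that provides dl_iterate_phdr in <link.h>): the callback checks each loaded object's PT_LOAD segments and stops at the object that contains the address. The names base_query, match_object and find_load_base are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

struct base_query {
    uintptr_t addr;     /* address to look up */
    uintptr_t base;     /* load base of the containing object, if found */
    const char *name;   /* object path as reported by the loader */
};

/* Callback for dl_iterate_phdr: stop at the first object that has a
 * PT_LOAD segment covering the queried address. */
static int match_object(struct dl_phdr_info *info, size_t size, void *data)
{
    struct base_query *q = data;
    (void)size;
    for (int i = 0; i < info->dlpi_phnum; ++i) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD)
            continue;
        uintptr_t seg = (uintptr_t)info->dlpi_addr + ph->p_vaddr;
        if (q->addr >= seg && q->addr < seg + ph->p_memsz) {
            q->base = (uintptr_t)info->dlpi_addr;
            q->name = info->dlpi_name;
            return 1;   /* a non-zero return stops the iteration */
        }
    }
    return 0;
}

/* Hypothetical helper: returns 1 and stores the load base in *base if
 * addr lies inside a loaded object, otherwise returns 0. */
int find_load_base(const void *addr, uintptr_t *base)
{
    struct base_query q = { (uintptr_t)addr, 0, NULL };
    assert(base != NULL);
    if (dl_iterate_phdr(match_object, &q) == 0)
        return 0;
    *base = q.base;
    return 1;
}
```

Subtracting the returned base from the printed address should give the offset to look up in the map file, with no reference function needed.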
dladdr does this, end of. :-)
More comprehensively, dladdr will take an address and work out which library and symbol it corresponds to... and then it'll give you the name of the library, the name of the symbol, and the base addresses of each. Personally I think that's nifty, and it's also making my current debugging job a lot easier.
Hope this helps!
Take a look at the /proc/[PID]/maps file.
This should contain the address of library mapping in process memory address space.
If you want to reach the executable portion, use readelf on your library and find the offset of .text section.
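As a sketch of reading that file, the helper below parses a single maps line into its start address, end address, permissions and path. It assumes the usual "start-end perms offset dev inode path" layout. parse_maps_line is a hypothetical name, and the buffer sizes are arbitrary choices for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse one line of /proc/[PID]/maps.
 * Returns 1 on success and 0 if the line does not match the expected
 * layout. path is set to "" for mappings without a file name. */
int parse_maps_line(const char *line, unsigned long *start,
                    unsigned long *end, char perms[5], char path[256])
{
    path[0] = '\0';
    /* Fields: start-end perms offset dev inode [path].
     * offset, dev and inode are parsed but not stored. */
    int got = sscanf(line, "%lx-%lx %4s %*lx %*s %*lu %255[^\n]",
                     start, end, perms, path);
    return got >= 3;
}
```

Reading /proc/self/maps line by line with fgets and keeping the lowest start address whose path matches the library would then give its load address.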
Check the System V ABI, chapter 5. For the lazy ones out there, here is the standard way to do this for systems supporting the ELF binary format:
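#include <link.h>
#include <string.h>
#include <sys/types.h>

/* Stores the load offset of the library whose path equals libname in
   *load_offset and returns 1, or returns 0 if it is not loaded. */
int find_load_offset(const char *libname, off_t *load_offset)
{
    for (Elf64_Dyn *dyn = _DYNAMIC; dyn->d_tag != DT_NULL; ++dyn) {
        if (dyn->d_tag == DT_DEBUG) {
            /* DT_DEBUG points at the loader's r_debug structure. */
            struct r_debug *r_debug = (struct r_debug *) dyn->d_un.d_ptr;
            struct link_map *link_map = r_debug->r_map;
            /* Walk the list of loaded objects. */
            while (link_map) {
                if (strcmp(link_map->l_name, libname) == 0) {
                    *load_offset = (off_t)link_map->l_addr;
                    return 1;
                }
                link_map = link_map->l_next;
            }
            break;
        }
    }
    return 0;
}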
This does not rely on any GNU extension.
On GNU systems, the macro ElfW(Dyn) will return either Elf64_Dyn or Elf32_Dyn which is handy.
GNU backtrace library?
I wouldn't rely on hacky approaches for something like this. I don't think there's any guarantee that the contents of libraries are loaded into contiguous memory.
