does the loader modify relocation information on program startup? - linux

I have always believed that resolving absolute addresses is completely the linker's job. That is, after the linker combines all object files into one executable, it will then modify all absolute addresses to reflect the new location inside the executable. But after reading here that the loader does not have to place the program's text at the address specified by the linker, I got really confused.
Take this code for example
Main.c
void printMe();
int main() {
    printMe();
    return 0;
}
Foo.c
#include <stdio.h>

/* Lots of other functions */

void printMe() {
    printf("Hello");
}
Say that after linking, the code for main gets placed at address 0x00000010 and the code for printMe gets placed at address 0x00000020. Then when the program is launched, the loader will indeed load main and printMe at the virtual addresses specified by the linker. But if the loader does not load the program in that manner, won't that break all absolute address references?

A program is normally composed of several modules created by a linker. There is the executable and usually a number of shared libraries. On some systems one executable can load another executable and call its starting routine as a function.
If all these linked modules had fixed addresses, it is likely there would be conflicts at load time. If two modules used the same address, the application could not load.
For decades, relocatable code has been the solution to that problem: a module can be loaded anywhere. Some systems take this a step further and place modules at random addresses for security (address space layout randomization).
There are some situations where code cannot be purely relocatable.
If you have something like this:
static int b, *a = &b;
the initialization depends upon where the module is placed in memory (and where "b" is located). Linkers usually generate information for such constructs so that the loader can fix them up.
Thus, this is not correct:
I have always believed that resolving absolute addresses is completely the linker's job.
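To make the fix-up case above concrete, here is a small hypothetical illustration (my own example, not from the original answer). The relocation name in the comment is the typical one on x86-64 Linux; other platforms use different names.

/* 'a' is initialized with the absolute address of 'b'. If this module can be
 * loaded anywhere, the linker cannot finish the job; it emits a dynamic
 * relocation (e.g. R_X86_64_RELATIVE on x86-64) that the loader applies once
 * the final load address is known. */
static int b;
static int *a = &b;

/* The call below, by contrast, is normally emitted as a PC-relative branch,
 * so it works unchanged wherever the module ends up in memory. */
static void callee(void) {}

void caller(void)
{
    callee();
}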

To my knowledge, that's not quite the case.
If the call is linked statically, then the address of the function is calculated by the linker. Because the relative address is known, a relative (PC-relative) call can be issued, and everything will be fine.
If it is linked dynamically, then ld.so comes in and loads the library. The symbol is resolved either by Load-time relocation of shared libraries or by Position Independent Code (PIC) in shared libraries (these two articles weren't written by me).
To put it simply:
Load-time relocation is done by rewriting the code with the correct addresses, which defeats write protection and prevents sharing of the text among different processes.
PIC is done by adding two sections called the GOT and the PLT, both at offsets known at link time. A call to a function in a dynamic library first calls a ...@plt stub (e.g. printf@plt), which then does jmp *GOT[offset]. On the first call, the GOT entry actually holds the address of the next instruction in the stub, which invokes the dynamic loader to resolve the function and patch the entry. On later calls, the entry holds the address of the function itself. As you can see, this costs additional memory and time compared to statically resolved code.
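If it helps, here is a tiny C sketch of the lazy-binding idea described above. It is only a conceptual model, not what ld.so actually does: the "GOT slot" is a writable function pointer that starts out pointing at a "resolver" stub, and the first call patches it to the real target so later calls go straight through.

#include <stdio.h>

static void real_target(const char *s)      /* the function the symbol resolves to */
{
    printf("%s\n", s);
}

static void resolver_stub(const char *s);   /* forward declaration */

/* the "GOT slot": writable data holding a code address */
static void (*got_slot)(const char *) = resolver_stub;

static void resolver_stub(const char *s)
{
    got_slot = real_target;                  /* "resolve" the symbol on first use */
    got_slot(s);                             /* then forward the call */
}

int main(void)
{
    got_slot("first call: routed through the resolver");
    got_slot("second call: goes directly to the target");
    return 0;
}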

Related

How can each process have its own copy of global data in a shared library?

I understand that, because shared libraries do not know where they will be placed by the dynamic loader, they have to rely on the GOT to resolve all references to global data. For example, if a shared library has a global variable named globe, a possible way to access such a variable would be something like mov eax, DWORD PTR [ecx+0x10], assuming that ecx contains the address of the GOT and the offset of globe is 0x10. Now, let's say that process A uses this shared library, immediately followed by process B. I know that the code of a shared library can be shared between processes, but data cannot, since each process could potentially change the data depending on its execution. Therefore, each process will get its own GOT, which means that, thanks to virtual memory, the address ecx + 0x10 will point to two completely different GOTs depending on which process is running that piece of code. But then say one of the processes loads a second shared library with a different global data member at offset 0x10 in its GOT. How exactly can the process using the two libraries access each library's global data if they are both at the same virtual address?
But then say one of the processes loads a second shared library with a different global data member at offset 0x10 in its GOT. How exactly can the process using the two libraries access each library's global data if they are both at the same virtual address?
There are three parts to the answer:
How does each library access its own globals?
How does each library access the other library's globals?
How does the main executable access globals from either library?
The best way to understand this is to compile two trivial libraries, and a main binary, and then examine various GOT sections and observe when they change and in what way.
The root of your confusion appears to be that you assume there is only one GOT. That is not the case: each library will have its own .got section, and the compiler and the runtime loader will arrange it so that ecx points to the right .got.
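To make that experiment concrete, here is one possible shape for the two trivial libraries (file names, symbol names, and build commands are just assumptions for illustration). Building each with something like gcc -shared -fPIC -o liba.so liba.c, then looking at readelf -S liba.so and objdump -R liba.so, shows that every library carries its own .got section with its own relocations.

/* liba.c -- hypothetical first library */
int globe_a = 1;

int get_a(void)
{
    return globe_a;   /* compiled with -fPIC, this load goes through liba's own GOT */
}

/* libb.c -- hypothetical second library */
int globe_b = 2;

int get_b(void)
{
    return globe_b;   /* same pattern, but through libb's own, separate GOT */
}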
For the main executable, the answer is "copy relocations".
Here is a good article on the subject.

Is it possible to force a range of virtual addresses?

I have an Ada program that was written for a specific (embedded, multi-processor, 32-bit) architecture. I'm attempting to use this same code in a simulation on 64-bit RHEL as a shared object (since there are multiple versions and I have a requirement to choose a version at runtime).
The problem I'm having is that there are several places in the code where the people who wrote it (not me...) have used Unchecked_Conversions to convert System.Addresses to 32-bit integers. Not only that, but there are multiple routines with hard-coded memory addresses. I can make minor changes to this code, but completely porting it to x86_64 isn't really an option. There are routines that handle interrupts, CPU task scheduling, etc.
This code has run fine in the past when it was statically-linked into a previous version of the simulation (consisting of Fortran/C/C++). Now, however, the main executable starts, then loads a shared object based on some inputs. This shared object then checks some other inputs and loads the appropriate Ada shared object.
Looking through the code, it's apparent that it should work fine if I can keep the logical memory addresses between 0 and 2,147,483,647 (32-bit signed int). Is there a way to either force the shared object loader to leave space in the lower ranges for the Ada code or perhaps make the Ada code "think" that its addresses are between 0 and 2,147,483,647?
Is there a way to either force the shared object loader to leave space in the lower ranges for the Ada code
The good news is that the loader will leave the lower ranges untouched.
The bad news is that it will not load any shared object there. There is no interface you could use to influence placement of shared objects.
That said, dlopen from memory (which we implemented in our private fork of glibc) would allow you to do that. But that's not available publicly.
Your other possible options are:
if you can fit the entire process into 32-bit address space, then your solution is trivial: just build everything with -m32.
use prelink to relocate the library to a desired address. Since that address should almost always be available, the loader is very likely to load the library exactly there.
link the loader with a custom mmap implementation, which detects the library of interest through some kind of side channel and does the mmap syscall with MAP_32BIT set (a sketch of such a wrapper follows this list), or
run the program in a ptrace sandbox. Such a sandbox can again intercept the mmap syscall and or-in MAP_32BIT when desirable.
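As a rough illustration of the mmap-wrapper option, here is a sketch of what such an interposed mmap could look like. It is only a sketch under strong assumptions: the "force MAP_32BIT for executable mappings" heuristic is made up for illustration, and because ld.so normally issues its own mmap calls directly, simply preloading this wrapper is not enough by itself; as noted above, the loader would have to be linked against it.

#define _GNU_SOURCE           /* for RTLD_NEXT and MAP_32BIT */
#include <dlfcn.h>
#include <sys/mman.h>

/* Delegate to the next mmap in the lookup chain, but force MAP_32BIT for
 * executable mappings so the code lands below 4 GB. Real code would identify
 * the library of interest through some side channel instead of this crude
 * PROT_EXEC heuristic. */
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
{
    static void *(*real_mmap)(void *, size_t, int, int, int, off_t);

    if (!real_mmap)
        real_mmap = (void *(*)(void *, size_t, int, int, int, off_t))
                        dlsym(RTLD_NEXT, "mmap");

    if (prot & PROT_EXEC)
        flags |= MAP_32BIT;

    return real_mmap(addr, length, prot, flags, fd, offset);
}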
or perhaps make the Ada code "think" that its addresses are between 0 and 2,147,483,647?
I don't see how that's possible. If the library stores an address of a function or a global in a 32-bit memory location, then loads that address and dereferences it ... it's going to get a 32-bit truncated address and a SIGSEGV on dereference.

How does GDB determine base addresses of shared libraries? [internals of the info sharedlibrary command]

I am trying to understand the internal workings behind GDB commands. After doing some initial homework on ELF, shared libraries, and address space layout randomization, I attempted to understand how GDB makes sense of the executable and the core file.
solib.c contains the implementation of shared library processing. I'm especially interested in the info sharedlibrary command.
The comment in solib.c goes like this:
/* Relocate the section binding addresses as recorded in the shared
object's file by the base address to which the object was actually
mapped. */
ops->relocate_section_addresses (so, p);
I could not understand much from this comment. Can somebody explain to me in plain English how this relocation happens? That is, every time an executable loads a shared object, the object gets loaded at some location, say X, and every symbol inside the shared library ends up at a fixed offset from it, say X+Y, with some size Z. My question is: how does GDB do the same range of address relocation, so that it matches the load segments in the core file? How does it take that hint from the executable?
how does GDB do the same range of address relocation, so that it matches the load segments in the core file
In other words, how does GDB find the relocation X?
The answer depends on the operating system.
On Linux, GDB finds the _DYNAMIC[] array of Elf{32,64}_Dyn structures in the core file, which contains an element with .d_tag == DT_DEBUG.
The .d_ptr in that element points to a struct r_debug (see /usr/include/link.h), which in turn points to a linked list of struct link_map entries describing all loaded shared libraries; each entry records the library's relocation in l_addr.
The relevant file in GDB is solib-svr4.c.
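For a live process you can see the same information that GDB recovers from the link_map list by asking the loader directly. The small sketch below (my own example, not GDB code) uses glibc's dl_iterate_phdr; the dlpi_addr it prints for each object is that object's relocation, i.e. the X in the question.

#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Print each loaded object's name and load base (the relocation amount). */
static int show_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)size;
    (void)data;
    printf("%-40s base 0x%lx\n",
           info->dlpi_name[0] ? info->dlpi_name : "(main program)",
           (unsigned long)info->dlpi_addr);
    return 0;      /* keep iterating */
}

int main(void)
{
    dl_iterate_phdr(show_object, NULL);
    return 0;
}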
EDIT:
I see that there are no .dynamic sections in the core file.
There shouldn't be. There is a .dynamic section in the executable and a matching LOAD segment in the core (the segment will "cover" the .dynamic section and have the contents that were there at runtime).

are ".o" files "loadable"?

I have been reading John R. Levine's Linkers and Loaders and I read that the properties of an object file will include one or more of the following.
file should be linkable
file should be loadable
file should be executable
Now, considering this example:
#include <stdio.h>

int main() {
    printf("testing\n");
    return 0;
}
Which I would compile and link with:
$ gcc -c t.c
$ gcc -o t t.o
I tried inspecting t.o using objdump and its type shows up as REL. Which of these properties does t.o satisfy? I believe it is linkable and non-executable. I would have believed that it is not loadable (unless you create an .so file from the .o file); however, the type REL means that it is supposed to be relocated, and relocation would occur only in the context of loading, so I'm confused here.
My questions, summarized:
Are ".o" files loadable?
Where can I read about the sections present in .o and .so files, and the differences between them?
An object file (i.e., a file with the .o extension) is not loadable. This is because it lacks critical information about how to resolve all the symbols within it: in this case, the printf symbol in particular would need additional information. (C compilers do not bind library identities into the object files they create, which is occasionally even useful.)
When you link the object file into a shared library (.so), you are adding that binding. Typically, you're also grouping a number of object files together and resolving references between them (plus a few more esoteric things). That then makes the result possible to load, since the loader can then just do resolution of references and loading of dependencies that it doesn't already know about.
Going from there to an executable is typically just a matter of adding on the OS-defined program bootstrap. This is a small piece of code that the OS calls to start the program running; it typically works by loading the rest of the program and its dependencies and then calling main() with information about the arguments. (It's also responsible for exiting cleanly if main returns.)
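One way to convince yourself that a relocatable object is not loadable is to try handing it to the dynamic loader. The sketch below is only an illustration: the file names t.o and libt.so assume you built them as in the question, plus something like gcc -shared -fPIC -o libt.so t.c for the library; on a typical glibc system the first dlopen fails while the second succeeds (link with -ldl on older glibc).

#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* A relocatable object (ET_REL) cannot be mapped by the dynamic loader. */
    void *h = dlopen("./t.o", RTLD_NOW);
    if (!h)
        printf("dlopen(t.o) failed: %s\n", dlerror());

    /* A shared object (ET_DYN) built from the same code can be loaded. */
    h = dlopen("./libt.so", RTLD_NOW);
    if (h)
        printf("dlopen(libt.so) succeeded\n");

    return 0;
}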
Just to set the context, this link states something similar (emphasis for readability only):
A file may be linkable, used as input by a link editor or linking loader. It may be executable, capable of being loaded into memory and run as a program, loadable, capable of being loaded into memory as a library along with a program, or any combination of the three.
A .o file is a linker object file, which according to this definition is not executable and definitely linkable. Loadable is a tougher call, but since .o files are not loadable without some decidedly non-cross-platform trickery, I'd say the spirit of the definition is that they are not loadable.

How tcmalloc gets linked to the main program

I want to know how malloc gets linked to the main program. Basically, I have a program which uses several static and dynamic libraries. I am including all of these in my makefile using the options "-llibName1 -llibName2".
The documentation of TCMalloc says that we can override malloc simply by setting "LD_PRELOAD=/usr/lib64/libtcmalloc.so". I am not able to understand how tcmalloc gets called by all these static and dynamic libraries. Also, how does tcmalloc get linked to the STL libraries and the new/delete operators of C++?
Can anyone please give any insight on this?
"LD_PRELOAD=/usr/lib64/libtcmalloc.so" directs the loader to use libtcmalloc.so before any other shared library when resolving symbols external to your program, and because libtcmalloc.so defines a symbol named "malloc", that is the verison your program will use.
If you omit the LD_PRELOAD setting, libc.so.6 (or whatever C library you have on your system) will be the first shared library to define a symbol named "malloc".
Note also that if you link against a static library which defines a symbol named "malloc" (and uses proper arguments, etc), or another shared library is loaded that defines a symbol named "malloc", your program will attempt to use that version of malloc.
That's the general idea anyway; the actual goings-on are quite interesting, and I will have to direct you to http://en.wikipedia.org/wiki/Dynamic_linker as a starting point for more information.
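To see the mechanism in miniature, here is a hedged sketch of a do-nothing malloc interposer (not tcmalloc's code; file and library names are made up). Because the preloaded library is searched before libc, its malloc definition wins for the executable and for every shared library it loads, including the C++ runtime, whose default operator new ultimately calls malloc in common implementations. The forwarding call uses glibc's exported __libc_malloc, so this particular sketch assumes glibc.

/* mymalloc.c -- hypothetical interposer.
 * Build:  gcc -shared -fPIC -o libmymalloc.so mymalloc.c
 * Run:    LD_PRELOAD=./libmymalloc.so ./your_program */
#include <stddef.h>
#include <unistd.h>

extern void *__libc_malloc(size_t size);   /* glibc's real allocator entry point */

void *malloc(size_t size)
{
    /* write() rather than printf() to avoid re-entering malloc */
    write(2, "malloc intercepted\n", 19);
    return __libc_malloc(size);
}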
