Why can a shared library created from non-PIC objects work? - linux

I'm confused. I tried this on Linux on x86.

PIC just makes life simpler for the loader, since it only has to modify a few global addresses in the code. Non-PIC code simply contains a lot more of these addresses, so the table of addresses that need relocation is bigger. But the loader must be able to relocate the code in either case (for example, to resolve the addresses of static/global variables and all function pointers).

The x86 ABI more or less supports non-PIC code in shared libraries. As pointed out before, it means that pages that would normally be shared will not be shared (because ld.so needs to patch references in the code itself rather than in a special place, the GOT).
But libraries built that way may be a bit faster, because PIC code is generally somewhat slower.
The amd64 ABI does not support that.
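To see what the loader has to patch, here is a minimal sketch of a translation unit that touches a global variable. The file name counter.c and the symbol names are hypothetical. The commands in the comment are ordinary GCC and binutils invocations, but the exact relocation names and warnings depend on the toolchain and target, so treat the described output as typical rather than guaranteed.

```c
/* counter.c (hypothetical file name).
 *
 * On 32-bit x86, build it both ways and compare:
 *   gcc -m32 -fno-pic -shared counter.c -o libcounter_nopic.so
 *   gcc -m32 -fPIC    -shared counter.c -o libcounter_pic.so
 *   readelf -d libcounter_nopic.so   # typically lists a TEXTREL entry
 *   readelf -r libcounter_nopic.so   # relocations that patch the code itself
 *   readelf -r libcounter_pic.so     # relocations confined to the GOT/data
 */

/* A global with external linkage: every access needs its final address. */
int counter = 0;

/* Non-PIC code embeds the address of `counter` directly in the
 * instructions; PIC code reaches it through the GOT instead. */
int bump(int by)
{
    counter += by;
    return counter;
}
```

The function behaves identically in both builds; only the place where the address of counter gets patched at load time differs.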

Related

Does the .so file still contain information about labels

In what phase of compilation does the compiler replace a label with the actual address?
I understand that in an instruction like jmp abc, abc is just a note that will eventually be replaced with an actual address. Is that right?
Does the final .so file still contain information about the label, or is the label replaced with the actual address when the file is loaded into memory?
TL;DR - your question is hard to answer, because it mixes a few concepts. For typical assembler labels, we use PC-relative addressing and labels are resolved at assemble time. For other 'external' labels, there are many cases and the resolution depends on the case.
There are four conceptual ways to address on almost all CPUs, and definitely on the ARM.
PC relative address. Current instruction +/- offset.
Absolute address. This is the one you are conceptually thinking of.
Register computed address. Calculated at run time. ldr pc, [rn, #xx]
Table based addressing. Global offset table, etc. Much like register computed addresses. ldr pc, [Rbase, Rindex, lsl #2]
The first two fit in a single instruction and are very efficient. The first is most desirable as the code can execute at ANY address as long as it maintains its original layout (i.e., you don't load it by splitting the code up).
In the list above, there is also the concept of 'build time' and 'run time'. The distinction is the difference between a linker and a loader. You have tagged this 'linux' and refer to an 'so' or shared library. Also, you are referring to assembler 'labels'. They are very similar concepts, but can be different as they will be one of the four classes of addressing above.
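The four addressing classes listed above map loosely onto ordinary C constructs. The sketch below is only an illustration with hypothetical function names; which instructions the compiler actually emits depends on the target, the optimization level, and flags such as -fPIC, and an optimizing compiler may inline some of these calls entirely.

```c
#include <assert.h>
#include <stddef.h>

static int add_one(int x)   { return x + 1; }
static int double_it(int x) { return x * 2; }

/* Table based addressing: the call target is loaded from a table slot. */
static int (*const op_table[])(int) = { add_one, double_it };

/* Call to a function in the same translation unit: usually a
 * PC-relative branch whose offset is fixed at assemble time. */
int call_direct(int x)
{
    return add_one(x);
}

/* Register computed address: the target is only known at run time. */
int call_via_pointer(int (*fn)(int), int x)
{
    assert(fn != NULL);
    return fn(x);
}

/* Table lookup followed by an indirect branch, similar in spirit
 * to the ldr pc, [Rbase, Rindex, lsl #2] form shown above. */
int call_via_table(size_t index, int x)
{
    assert(index < sizeof op_table / sizeof op_table[0]);
    return op_table[index](x);
}
```

Absolute addressing has no separate C spelling; it is what the toolchain may use for any of these calls when the binary is linked to run at one fixed address.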
Most often in assembler, the labels are PC relative. PC-relative addressing needs no additional structure, except that the chunk of code must be kept contiguous. When an assembler source is a 'module' (a compilation unit, for a compiler) and the assembler produces an 'object' from it, references within it will use PC-relative addressing.
The object format can be annotated with external addresses, and there are many choices in how an assembler may output these addresses. They are generally controlled by 'pseudo-ops'. Such an annotation is a note (a separate section with a defined format) in the object file; the instruction is semi-complete in this form. It may prepare to use an offset table, use a register-based computation (like r9+constant), etc.
For the typical case of linking (done at build time), we will either use PC relative or absolute. If we fix our binary to only run at one address, the assembler can set up for absolute addressing and resolve these through linking. In this case, the binary must be loaded at a fixed address. The assembler 'modules' or object files can be completely glued together to have everything resolved. Then there are no 'load' time fix-ups. Other determining factors are whether code and data are separate, and whether the system is using an MMU. It is often desirable to keep code constant, so that many processes can use the same RAM/ROM pages while having separate data. As well as being memory efficient, this provides some form of security: although it is not extremely robust, it will prevent accidental code overwrites and will provide debugging help in the form of SIGSEGV.
It is possible to write a PC-relative initialization routine which will do the fix-ups to create a table in your own binary. So a 'loader' just has to determine where the code is running and then do the calculations. For statically shared libraries, you typically know the libraries you will run, but not where they are. For dynamically shared libraries, you might not even know at compile time what the library is that you will run.
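The fix-up idea can be sketched in plain C: entries are stored as offsets from the start of an image, and a small routine turns them into real pointers once the load address is known. This is a toy model, not how ld.so is implemented; the image contents, the offsets table, and the function names are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A position-independent "image": three strings packed back to back. */
static const char image[] = "alpha\0beta\0gamma";

/* Build-time information: where each entry sits relative to the image start. */
static const size_t offsets[] = { 0, 6, 11 };
#define ENTRY_COUNT (sizeof offsets / sizeof offsets[0])

/* Run-time table of absolute addresses, filled in by the fix-up pass. */
static const char *fixed_up[ENTRY_COUNT];

/* The "loader" step: given the address the image ended up at,
 * convert every stored offset into an absolute pointer. */
void apply_fixups(const char *load_base)
{
    for (size_t i = 0; i < ENTRY_COUNT; i++) {
        assert(offsets[i] < sizeof image);
        fixed_up[i] = load_base + offsets[i];
    }
}

/* After fix-up, lookups go through the table and no longer care
 * where the image was loaded. */
const char *entry(size_t i)
{
    assert(i < ENTRY_COUNT);
    return fixed_up[i];
}
```

Calling apply_fixups(image) followed by entry(1) yields a pointer to the string "beta". Passing the address of a copy of the image instead would make the same table point into the copy, which is the whole point of keeping offsets rather than addresses in the file.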
A Linux distribution can use either. If you have some sort of standard Linux desktop distribution (Ubuntu/Debian, Red Hat, etc.), you will have something based on ARM ELF LSB and dynamic shared libraries. You need to use the assembler pseudo-ops to support this type of addressing, or use a compiler to do it for you. The majority of all 'labels' in a shared library will be PC relative and not show up. Some labels can show up for debugging reasons (man strip) and some are absolutely needed to resolve addresses at run time.
Some time ago I also asked a related question, Using GCC pre-processor as an assembler. The key concept is that the assembler is generally 'two pass' and needs to do these local address fix-ups. This question then asks about a second level of that concept, where we are adding shared libraries. The online book Linkers and Loaders is a great resource if you want to know more.
See also:
Static linked shared libraries
Thumb start function
What is the point of busybox?
The final executable has to have all addresses, otherwise it would not work.
The thing to remember is that there is static linking and dynamic linking (e.g. using shared libraries). With static linking, the binary file has all addresses resolved. With dynamic linking, addresses are resolved during loading: the binary carries relocation information that the dynamic linker replaces with actual addresses. But at the end of the day, the loaded binary in memory has all addresses.
In what phase of compilation does the compiler replace label into actual addr
The compiler can replace a label with the actual address (or offset) when it knows the destination, for example for a call to a function in the same compilation unit.
When the destination is outside the compilation unit and out of reach for the compiler, the compiler leaves relocation information in the object file. The linker replaces that with an actual address at link time.
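A small sketch makes the two cases concrete. The file name reloc_demo.c and the function names are hypothetical. The inspection commands are standard binutils tools, but what they print varies by target and compiler options, so the comments describe the usual outcome only.

```c
/* reloc_demo.c (hypothetical file name). Compile to an object and inspect it:
 *   gcc -c reloc_demo.c
 *   nm reloc_demo.o          # 'U strlen' marks a symbol left undefined
 *   readelf -r reloc_demo.o  # relocation entries the linker must process
 */
#include <stddef.h>
#include <string.h>

/* Defined in this translation unit: the toolchain knows where it is
 * relative to its caller, so the call usually needs no symbol lookup
 * at link time. */
static size_t count_spaces(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == ' ') {
            n++;
        }
    }
    return n;
}

/* strlen lives in libc, outside this translation unit, so the object
 * file typically carries a relocation naming strlen for this call. */
size_t non_space_length(const char *s)
{
    return strlen(s) - count_spaces(s);
}
```

When the object is linked into a dynamically linked program, the strlen reference is usually bound through the PLT/GOT by the dynamic linker rather than being given a fixed address by the static linker.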

Shared library symbol conflicts and static linking (on Linux)

I'm encountering an issue which is described in a good article, Shared Library Symbol Conflicts (on Linux). The problem is that when the executable and the .so both define a function with the same name, a call to that name from inside the .so goes to the function in the executable rather than the one in the .so itself.
Let's talk about the case in this article. I understand the DoLayer() function in layer.o has an external function dependency of DoThing() when compiling layer.o.
But when compiling the libconflict.so, shouldn't the external function dependency be resolved in-place and just replaced with the address of conflict.o/DoThing() statically?
Why does the layer.o/DoLayer() still use dynamic linking to find DoThing()? Is this a designed behavior?
Is this a designed behavior?
Yes.
At the time of introduction of shared libraries on UNIX, the goal was to pretend that they work just as if the code was in a regular (archive) library.
Suppose you have foo() defined in both libfoo and libbar, and bar() in libbar calls foo().
The design goal was that cc main.c -lfoo -lbar works the same regardless of whether libfoo and libbar are archive or shared libraries. The only way to achieve this is to have libbar.so use dynamic linking to resolve the call from bar() to foo(), despite having a local version of foo().
This design makes it impossible to create a self-contained libbar.so -- its behavior (which functions it ends up calling) depends on what other functions are linked into the process. This is also the opposite of how Windows DLLs work.
Creating self-contained DSOs was not a consideration at the time, since UNIX was effectively open-source.
You can change the rules with special linker flags, such as -Bsymbolic. But the rules get complicated very quickly, and (since that isn't the default) you may encounter bugs in the linker or the runtime loader.
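Besides linker flags such as -Bsymbolic, a commonly used way to keep a library's internal calls from being redirected is to route them through a symbol with hidden visibility. The sketch below uses the GCC/Clang visibility attribute; DoThing and DoLayer echo the names from the article, while DoThing_impl is a hypothetical helper introduced here for illustration.

```c
/* Hypothetical library source, built for example with:
 *   gcc -fPIC -shared conflict.c -o libconflict.so
 */

/* Hidden visibility: the symbol is not exported from the shared object,
 * so references to it are bound inside the library when it is linked
 * and cannot be interposed by the executable. */
__attribute__((visibility("hidden")))
int DoThing_impl(void)
{
    return 42;
}

/* Exported entry point for users of the library. An executable that
 * defines its own DoThing() still wins for calls made *by name* to
 * DoThing, which is the default ELF behavior described above. */
int DoThing(void)
{
    return DoThing_impl();
}

/* Internal caller: goes straight to the hidden implementation, so it
 * always gets the library's own version. */
int DoLayer(void)
{
    return DoThing_impl() + 1;
}
```

The trade-off is that tools relying on interposition, such as LD_PRELOAD-based wrappers, can no longer intercept the internal call either.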
Yes, this is a designed behavior. When you link a program into a binary, all the references to named external (non-static) functions are resolved to point into the symbol table for the binary. Any shared libraries that are linked against are specified as DT_NEEDED entries.
Then, when you run the binary, the dynamic linker loads each required shared library at a suitable address and resolves each symbol to an address. Sometimes this is done lazily, and sometimes it is done all at once at startup. If there are multiple symbols with the same name, the dynamic linker picks one of them (the first it finds in its search order), and your program may misbehave or crash if that is not the one the caller expected.
Note that this is the behavior on Linux, which has all symbols as a flat namespace. Windows resolves symbols differently, using a tree topology, which has both advantages (fewer conflicts) and disadvantages (the inability to allocate memory in one library and free it in another).
The Linux behavior is very important if you want things like LD_PRELOAD to work. This allows you to use debugging tools like Electric Fence and CPU profiling tools like the Google performance tools, or replace a memory allocator at runtime. None of these things would work if symbols were preferentially resolved to their binary or shared library.
The GNU dynamic linker does support symbol versions, however, so that it's possible to load multiple versions of a shared library into the same program. Oftentimes distros like Debian will do this with libraries they expect to change frequently, like OpenSSL. If the program uses liba which uses OpenSSL 1.0 and libb which uses OpenSSL 1.1, then the program should still function in such a case since OpenSSL has versioned symbols, and each library will use the appropriate version of the relevant symbol.

Interpose statically linked binaries

There's a well-known technique for interposing dynamically linked binaries: creating a shared library and using the LD_PRELOAD variable. But it doesn't work for statically-linked binaries.
One way is to write a static library that interposes the functions and link it with the application at compile time. But this isn't practical because re-compiling isn't always possible (think of third-party binaries, libraries, etc).
So I am wondering if there's a way to interpose statically linked binaries in the same way LD_PRELOAD works for dynamically linked binaries, i.e., with no code changes or re-compilation of existing binaries.
I am only interested in ELF on Linux. So it's not an issue if a potential solution is not "portable".
One way is to write a static library that interpose the functions and link it with the application at compile time.
One difficulty with such an interposer is that it can't easily call the original function (since it has the same name).
The linker --wrap=<symbol> option can help here.
But this isn't practical because re-compiling
Re-compiling is not necessary here, only re-linking.
isn't always possible (think of third-party binaries, libraries, etc).
Third-party libraries work fine (relinking), but binaries are trickier.
It is still possible to do this using a displaced execution technique, but the implementation is quite tricky to get right.
I'll assume you want to interpose symbols in the main executable which came from a static library, which is equivalent to interposing a symbol defined in the executable. The question thus reduces to whether it's possible to intercept a function defined in the executable.
This is not possible (EDIT: at least not without a lot of work - see comments to this answer) for two reasons:
by default, symbols defined in an executable are not exported and so are not accessible to the dynamic linker (you can alter this via -export-dynamic or export lists, but this has unpleasant performance or maintenance side effects)
even if you export the necessary symbols, ELF requires the executable's dynamic symtab to always be searched first during symbol resolution (see section 1.5.4 "Lookup Scope" in dsohowto); the symtab of an LD_PRELOAD-ed library will always follow that of the executable and thus won't have a chance to intercept the symbols
What you are looking for is called binary instrumentation (e.g., using Dyninst or ptrace). The idea is you write a mutator program that attaches to (or statically rewrites) your original program (called mutatee) and inserts code of your choice at specific points in the mutatee. The main challenge usually revolves around finding those insertion points using the API provided by the instrumentation engine. In your case, since you are mainly looking for static symbols, this can be quite challenging and would likely require heuristics if the mutatee is stripped of non-dynamic symbols.

Is it valid to link non PIC objects into an executable with PIC objects

I'm adding a thread local variable to a couple of object files that are always linked directly to executables. These objects will never be included inside a shared library (and it's safe to assume this will hold true for the foreseeable future). This means the -fPIC flag is not required for these objects, correct?
Our codebase has the -fPIC flag for all objects by default. Many of these are included in shared libraries, so the use of -fPIC makes sense. However, this flag presents an issue when debugging the new thread local variable, because GDB crashes while stepping over a thread local variable compiled with -fPIC. If I remove -fPIC from those few object files with the new thread local variable, I can debug properly.
I can't find any authoritative statements that mixing non-PIC objects with PIC objects in an executable is okay. My testing thus far shows it's okay, but it does not feel kosher, and online discussion is generally "do not mix PIC and non PIC" due to the shared library case.
Is it safe to link non PIC objects into an executable built with PIC objects and libraries in this case? Maybe there is an authoritative statement from GCC docs on this being safe, but I cannot find it.
EDIT: Binary patching gcc to avoid this bug is not a solution in the short-term. Switching compiler on Linux is not a possible solution.
Except for bugs like the above, it should be fine. I can't point you to definitive documents describing this; I can only speak from experience.
gcc (or the assembler) will produce different code when you specify -fPIC, but the resulting code still uses standardized relocation types.
For linking the pieces together, this doesn't matter at first: a linker will just stubbornly string everything together and doesn't know whether the code is PIC or non-PIC. I know this because I work with systems which don't support shared libraries, and I had to write my own loaders.
The final point, though, is that you can tell the linker whether the resulting object should be a shared library or not. Only then will the linker generate some (OS-specific) structures and symbols to denote imports/exports.
Otherwise the linker will just finish its work; the primary difference is that missing symbols will result in an error.
The clean separation between compiler and linker should guarantee that the flags do not matter (outside of performance differences). I would be careful with LTO, though; it has had several problems with differing compiler settings in the past.
As said, I spent some time investigating this and read several documents about ELF and dynamic loaders. You will not find an explicit mention of linking PIC with non-PIC anywhere, but the linking process really doesn't care about the compiler settings for the inputs; valid code will stay valid code.
If you want to link non-PIC code into a shared library (PIC), the linker will quit if absolute relocations are encountered (which is very likely).
If you want to link any code into a program, you are only limited by what the final program can deal with. On an OS supporting PIC you can use anything; otherwise the linker might complain about missing symbols or unsupported sections/relocation types.
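For the thread local variable case from the question, a minimal sketch of such an object file might look like the following. The file and function names are hypothetical. The compiler usually picks a different TLS access model with and without -fPIC, which shows up as different relocation types in the object, but both kinds of object can be linked into an ordinary executable.

```c
/* tls_counter.c (hypothetical file name). Compare the two objects:
 *   gcc -c -fno-pic tls_counter.c -o tls_nopic.o
 *   gcc -c -fPIC    tls_counter.c -o tls_pic.o
 *   readelf -r tls_nopic.o
 *   readelf -r tls_pic.o     # typically different TLS relocation types
 */

/* One counter per thread. _Thread_local is the C11 spelling;
 * GCC also accepts the older __thread keyword. */
static _Thread_local unsigned long calls;

/* Increments and returns the calling thread's counter. */
unsigned long record_call(void)
{
    return ++calls;
}

/* Reads the calling thread's counter without changing it. */
unsigned long calls_on_this_thread(void)
{
    return calls;
}
```

Because the variable is only ever used from code linked into the executable itself, the cheaper non-PIC access model is sufficient here; it is the shared library case that needs the more general models.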
It is possible almost always, but sometimes it requires some tricks

How to create a shared object that is statically linked with pthreads and libstdc++ on Linux/gcc?

Before I go to answering your question as it was described, I will note that it is not exactly clear what you are trying to achieve in the end, and there is probably a better solution to your problem.
That said - there are two main problems with trying to do what you described:
One is, that you will need to decompose libpthread and libstdc++ to the object files they are made with. This is because ELF binaries (used on Linux) have two levels of "run time" library loading - even when an executable is statically linked, the loader has to load the statically linked libraries within the binary on execution, and map the right memory addresses. This is done before the shared linkage of libraries that are dynamically loaded (shared objects) and mapped to shared memory. Thus, a shared object cannot be statically linked with such libraries, as at the time the object is loaded, all static linked libraries were loaded already. This is one difference between linking with a static library and a plain object file - a static library is not merely glued like any object file into the executable, but still contains separate tables which are referred to on loading. (I believe that this is in contrast to the much simpler static libraries in MS-DOS and classic Windows, .LIB files, but there may be more to those than I remember).
Of course you do not actually have to decompose libpthread and libstdc++, you can just use the object files generated when building them. Collecting them may be a bit difficult though (look for the objects referred to by the final Makefile rule of those libraries). And you would have to use ld directly and not gcc/g++ to link, to avoid linking with the dynamic versions as well.
The second problem is consequential. If you do the above, you will surely have such a shared object / dynamic library as you asked to build. However, it will not be very useful, as once you try to link a regular executable that uses those libpthread/libstdc++ (the latter being any C++ program) with this shared object, it will fail with symbol conflicts - the symbols of the static libpthread/libstdc++ objects you linked your shared object against will clash with the symbols from the standard libpthread/libstdc++ used by that executable, no matter if it is dynamically or statically linked with the standard libraries.
You could of course then try to either hide all symbols in the static objects from libstdc++/libpthread used by your shared library, make them private in some way, or rename them automatically on linkage so that there will be no conflict. However, even if you get that to work, you will find some undesirable results at run time, since both libstdc++ and libpthread keep quite a bit of state in global variables and structures, which you would now have in duplicate, each copy unaware of the other. This will lead to inconsistencies between these global data and the underlying operating system state such as file descriptors and memory bounds (and perhaps some values from the standard C library such as errno for libstdc++, and signal handlers and timers for libpthread).
To avoid over-broad interpretation, I will add a remark: at times there can be sensible grounds for wanting to statically link against even such basic libraries as libstdc++ and even libc, and even though it is becoming a bit more difficult with recent systems and versions of those libraries (due to a bit of coupling with the loader and special linker tricks used), it is definitely possible - I did it a few times, and know of other cases in which it is still done. However, in that case you need to link a whole executable statically. Static linkage with standard libraries combined with dynamic linkage with other objects is not normally feasible.
Edit: One issue which I forgot to mention but is important to take into account is C++ specific. C++ was unfortunately not designed to work well with the classic model of object linkage and loading (used on Unix and other systems). This makes shared libraries in C++ not as portable as they should be, because a lot of things such as type information and templates are not cleanly separated between objects (often being taken, together with a lot of actual library code, at compile time from the headers). libstdc++ for that reason is tightly coupled with GCC, and code compiled with one version of g++ will in general only work with the libstdc++ from this (or a very similar) version of g++. As you will surely notice if you ever try to build a program with GCC 4 with any non-trivial library on your system that was built with GCC 3, this is not just libstdc++. If your reason for wanting to do that is trying to ensure that your shared object is always linked with the specific versions of libstdc++ and libpthread that it was built against, this would not help because a program that uses a different/incompatible libstdc++ would also be built with an incompatible C++ compiler or version of g++, and would thus fail to link with your shared object anyway, aside from the actual libstdc++ conflicts.
If you wonder "why wasn't this done simpler?", a general rumination worth pondering: For C++ to work nicely with dynamic/shared libraries (meaning compatibility across compilers, and the ability to replace a dynamic library with another version with a compatible interface without rebuilding everything that uses it), not just compiler standardization is needed, but at the level of the operating system's loader, the structure and interface of object and library files and the work of the linker would need to be significantly extended beyond the relatively simple Unix classics used on common operating systems (Microsoft Windows, Mach based systems and NeXTStep relatives such as Mac OS, VMS relatives and some mainframe systems also included) for natively built code today. The linker and dynamic loader would need to be aware of such things as templates and typing, having to some extent the functionality of a small compiler to actually adapt the library's code to the type given to it - and (personal subjective observation here) it seems that higher-level intermediate code (together with higher-level languages and just-in-time compilation) is gaining ground faster and is likely to be standardized sooner than such extensions to the native object formats and linkers.
You mentioned in a separate comment that you are trying to port a C++ library to an embedded device. (I am adding a new answer here instead of editing my original answer here because I think other StackOverflow users interested in this original question may still be interested in that answer in its context)
Obviously, depending on how stripped down your embedded system is (I have not much embedded Linux experience, so I am not sure what is most likely), you may of course be able to just install the shared libstdc++ on it and dynamically link everything as you would do otherwise.
If dynamically linking with libstdc++ would not be good for you or not work on your system (there are so many different levels of embedded systems that one cannot know), and you need to link against a static libstdc++, then as I said, your only real option is static linking the executable using the library with it and libstdc++. You mentioned porting a library to the embedded device, but if this is for the purpose of using it in some code you write or build on the device and you do not mind a static libstdc++, then linking everything statically (aside from perhaps libc) is probably OK.
If the size of libstdc++ is a problem, and you find that your library is actually only using a small part of its interfaces, then I would nonetheless suggest first trying to determine the actual space you would save by linking against only the parts you need. It may be significant or not, I never looked that deep into libstdc++ and I suspect that it has a lot of internal dependencies, so while you surely do not need some of the interfaces, you may or may not still depend on a big part of its internals - I do not know and did not try, but it may surprise you. You can get an idea by just linking a binary using the library against a static build of it and libstdc++ (not forgetting to strip the binary, of course), and comparing the size of the resulting executable with the total size of a (stripped) executable dynamically linked together with the full (stripped) shared objects of the library and libstdc++.
If you find that the size difference is significant, but do not want to statically link everything, you can try to reduce the size of libstdc++ by rebuilding it without some parts you know that you do not need (there are configure-time options for some parts of it, and you can also try to remove some independent objects at the final creation of libstdc++.so). There are some tools to optimize the size of libraries - search the web (I recall one from a company named MontaVista but do not see it on their web site now, there are some others too).
Other than the straightforward above, some ideas and suggestions to think of:
You mentioned that you use uClibc, which I never fiddled with myself (my experience with embedded programming is a lot more primitive, mostly involving assembly programming for the embedded processor and cross-compiling with minimal embedded libraries). I assume you checked this, and I know that uClibc is intended to be a lightweight but rather full standard C library, but do not forget that C++ code is hardly independent of the C library, and g++ and libstdc++ depend on quite some delicate things (I remember problems with libc on some proprietary Unix versions), so I would not just assume that g++ or the GNU libstdc++ actually works with uClibc without trying - I don't recall seeing it mentioned in the uClibc pages.
Also, if this is an embedded system, think of its performance, compute power, overall complexity, and timing/simplicity/solidity requirements. Take into consideration the complexity involved, and think whether using C++ and threads is appropriate in your embedded system, and if nothing else in the system uses those, whether it is worth introducing for that library. It may be, not knowing the library or system I cannot tell (again, embedded systems being such a wide range nowadays).
And in this case also, just a quick link I stumbled upon looking for uClibc -- if you are working on an embedded system, using uClibc, and want to use C++ code on it -- take a look at uClibc++. I do not know how much of the standard C++ stuff you need and it already supports, and it seems to be an ongoing project, so not clear if it is in a state good enough for you already, but assuming that your work is also under development still, it might be a good alternative to GCC's libstdc++ for your embedded work.
I think this guy explains quite well why that wouldn't make sense. C++ code that uses your shared object but a different libstdc++ would link alright, but wouldn't work.

Resources