GNU/Debian Linux and LD - linux

Lets say I have a massive project which consists of multiple dynamic libraries that will all get installed to /usr/lib or /usr/lib64. Now lets say that one of the libraries call into another of the compiled libraries. If I place both of the libraries that are dependent on eachother in the same location will the ld program be able to allow the two libraries to call eachother?

The answer is perhaps yes, but it is a very bad design to have circular references between two libraries (i.e. liba.so containing function fa, calling function fb from libb.so, calling function ga from liba.so).
You should merge the two libraries in one libbig.so. And don't worry, libraries can be quite big. (some corporations have Linux libraries of several hundred megabytes of code).
The gold linker from package binutils-gold on Debian should be useful to you. It works faster than the older linker from binutils.

Yes, as long as their location is present in set of directories ld searches for libraries in. You can override this set by using LD_LIBRARY_PATH enviroment variable.
See this manual, it will resolve your questions.

If you mean the runtime dynamic linker /lib/ld-linux* (as opposed to /usr/bin/ld), it will look for libraries in your LD_LIBRARY_PATH, which typically includes /usr/lib and /usr/lib64.
In general, /lib/ld-* are used for .so libraries at run-time; /usr/bin/ld is used for .a libraries at compile-time.
However, if your libraries are using dlopen() or similar to find one another (e.g. plug-ins), they may have other mechanisms for finding one another. For example, many plug-in systems will use dlopen to read every library in a certain (one or many) directory/ies.

Related

What to do when two shared libraries which uses different versions of the same 3rd party library?

I have a process A which uses two shared libraries: libA.so and libB.so. Because the two libraries were written by different people. Unfortunately libA.so uses version 1.0 of the 3rd party library libD.so. While libB.so uses version 2.0 of the library in static form libD.a. I know that if libA.so and libA.so use libD.so, some errors might happen because of the Global Symbol Interpose. But does this situation have the same problem too?
I know the link flag -Bsymbolic could be used on libA.so or libB.so to force the symbol resolving symbols with the library first. In order to make process A run correctly, both of the two libraries must be linked with this flag, am I right? However, I don't have the source code of libA.so. So I cannot re-link the libA.so again.
To be more general, if one process uses two 3rd party libraries, which contains another same 3rd party library. Will the same thing happen? Is there anything I can do to solve this problem?
This may or may not help you, but given the lack of information I'm hoping it at least sparks an idea or leads you to something similar.
This is an application that allows you to alter your shell settings on a per directory basis:
https://github.com/zimbatm/direnv
It sounds like you actually have an issue that would require you to recompile one of your libraries from source though. That's not ideal, but if there is no build using a compatible thirdparty version you might seek a completely different library to accomplish the original task.

A question about how loader locates libraries at runtime

Only a minimum amount of work is done
at compile time by the linker; it only
records what library routines the
program needs and the index names or
numbers of the routines in the
library. (source)
So it means ld.so won't check all libraries in its database,only those recorded by the application programe itself, that is to say, only those specified by gcc -lxxx.
This contradicts with my previous knowledge that ld.so will check all libraries in its database one by one until found.
Which is the exact case?
I will make a stab at answering this question...
At link time the linker (not ld.so) will make sure that all the symbols the .o files that are being linked together are satisfied by the libraries the program is being linked against. If any of those libraries are dynamic libraries, it will also check the libraries they depend on (no need to include them in the -l list) to make sure that all of the symbols in those libraries are satisfied. And it will do this recursively.
Only the libraries the executable directly depends on via supplied -l parameters at link time will be recorded in the executable. If the libraries themselves declared dependencies, those dependencies will not be recorded in the executable unless those libraries were also specified with -l flags at link time.
Linking happens when you run the linker. For gcc, this usually looks something like gcc a.o b.o c.o -lm -o myprogram. This generally happens at the end of the compilation process. Underneath the covers it generally runs a program called ld. But ld is not at all the same thing as ld.so (which is the runtime loader). Even though they are different entities they are named similarly because they do similar jobs, just at different times.
Loading is the step that happens when you run the program. For dynamic libraries, the loader does a lot of jobs that the linker would do if you were using static libraries.
When the program runs, ld.so (the runtime loader) actually hooks the symbols up on the executable to the definitions in the shared library. If that shared library depends on other shared libraries (a fact that's recorded in the library) it will also load those libraries and hook things up to them as well. If, after all this is done, there are still unresolved symbols, the loader will abort the program.
So, the executable says which dynamic libraries it directly depends upon. Each of those libraries say which dynamic libraries they directly depend upon, and so forth. The loader (ld.so) uses that to decide which libraries to look in for symbols. It will not go searching through random other libraries in a 'database' to find the appropriate symbols. They must be in libraries that are in the dependency chain.

What LD stand for on LD_LIBRARY_PATH variable on *unix?

I know that LD_LIBRARY_PATH is a environment variable where the linker will look for the shared library (which contains shared objects) to link with the executable code.
But what does the LD Stands for, is it for Load? or List Directory?
Linker. The *nix linker is called ld. When a program with dynamic libraries is linked, the linker adds additional code to look for dynamic libraries to resolve symbols not statically linked. Usually this code looks in /lib and /usr/lib. LD_LIBRARY_PATH is a colon separated list of other directories to search.
"ldd" is a handy program to see where the libraries are: try "ldd /bin/ls", for example.
It could also mean "Loader", though. ;-)
Editorial:
As a (semi) interesting side-note: I think dynamic libraries will go away someday. They were needed when disk space and system memory was scarce. There is a performance hit to using them (i.e. the symbols need to be resolved and the object code edited). In these modern days of 3GB memory and 7 second bootup times, it might be appropriate to go back to static linking.
Except for the fact that every C++ program would magically grow to 3MB. ;-)
LD_LIBRARY_PATH - stands for LOAD LIBRARY PATH or some times called as LOADER LIBRARY PATH

Any downsides to using statically linked applications on Linux?

I seen several discussions here on the subject, but wanted to ask about my particular situation:
If I have some 3rd part libraries which my application is using, and I'd like to link them together in order to save myself the hassle in LD_LIBRARY, etc., is there any downside to it on Linux, other then larger file size?
Also, is it possible to statically link only some libraries, and other (standard Linux libraries) to link dynamically?
Thanks.
It is indeed possible to dynamically link against some libraries and statically link against others.
It sounds like what you really want to do is dynamically link against the system libraries, and statically link against the nonstandard ones that a user may not have installed (or that different users may have different installations of).
That's perfectly reasonable.
It's not generally a good idea to statically link against system libraries, especially libc.
It can often make sense to statically link against libraries that do not come with the OS and that will not be distributed with your application.
There are some bits of libc - those that use nsswitch - that need to load libraries dynamically. This can cause problems if you want to produce a completely static binary.
Statically linking your 3rd party libraries into your application should be completely fine.
The statically linked binary will be larger than if you had uses a shared library, but I find that disadvantage outweight the library path hassles, provided I control the distribution of all the libraries involved. If you are dependant on a particular distros shared libraries, then you have no choice but to use dynamic linking.
The main disadvantage I see is your application loses any automatic bugfixes that might be applied to a shared library. On the flip-side you don't get new bugs.
Static linking does not just affect the file size of the library, it also affects the memory footprint and start up time of the application. Dynamically linked libraries are loaded once no matter how many programs use them. Statically linked libraries must be loaded once per program that uses them (because they are now part of that program).
To answer your second question, yes, it is possible to have dynamic and static libraries linked to the same application. Just be careful to avoid interlibrary dependencies so you don't have a problem with library order. You should be able to list the libraries in any arbitrary order. Where I work, we prefer to list them alphabetically.
Edit: To link a static library, use the flag -lfoo. To add a directory to the library search path, use -L/path/to/libfoo.
Edit: You don't have to link a dynamic library. Your program can use a function provided by your compiler to open a dynamic library at run time, or you can link it at compile time and the compiler will resolve the symbols but not include them in the binary. See pjc50's comment below.
Statically linking will make your binary bulky, but you wont need to have a shared version of that library on the target runtime environment. This is especially the case while developing embedded apps.

How to create a shared object that is statically linked with pthreads and libstdc++ on Linux/gcc?

How to create a shared object that is statically linked with pthreads and libstdc++ on Linux/gcc?
Before I go to answering your question as it was described, I will note that it is not exactly clear what you are trying to achieve in the end, and there is probably a better solution to your problem.
That said - there are two main problems with trying to do what you described:
One is, that you will need to decompose libpthread and libstdc++ to the object files they are made with. This is because ELF binaries (used on Linux) have two levels of "run time" library loading - even when an executable is statically linked, the loader has to load the statically linked libraries within the binary on execution, and map the right memory addresses. This is done before the shared linkage of libraries that are dynamically loaded (shared objects) and mapped to shared memory. Thus, a shared object cannot be statically linked with such libraries, as at the time the object is loaded, all static linked libraries were loaded already. This is one difference between linking with a static library and a plain object file - a static library is not merely glued like any object file into the executable, but still contains separate tables which are referred to on loading. (I believe that this is in contrast to the much simpler static libraries in MS-DOS and classic Windows, .LIB files, but there may be more to those than I remember).
Of course you do not actually have to decompose libpthread and libstdc++, you can just use the object files generated when building them. Collecting them may be a bit difficult though (look for the objects referred to by the final Makefile rule of those libraries). And you would have to use ld directly and not gcc/g++ to link, to avoid linking with the dynamic versions as well.
The second problem is consequential. If you do the above, you will sure have such a shared object / dynamic library as you asked to build. However, it will not be very useful, as once you try to link a regular executable that uses those libpthread/libstdc++ (the latter being any C++ program) with this shared object, it will fail with symbol conflicts - the symbols of the static libpthread/libstdc++ objects you linked your shared object against will clash with the symbols from the standard libpthread/libstdc++ used by that executable, no matter if it is dynamically or statically linked with the standard libraries.
You could of course then try to either hide all symbols in the static objects from libstdc++/libpthread used by your shared library, make them private in some way, or rename them automatically on linkage so that there will be no conflict. However, even if you get that to work, you will find some undesireable results in runtime, since both libstdc++/libpthread keep quite a bit of state in global variables and structures, which you would now have duplicate and each unaware of the other. This will lead to inconsistencies between these global data and the underlying operating system state such as file descriptors and memory bounds (and perhaps some values from the standard C library such as errno for libstdc++, and signal handlers and timers for libpthread.
To avoid over-broad interpretation, I will add a remark: at times there can be sensible grounds for wanting to statically link against even such basic libraries as libstdc++ and even libc, and even though it is becoming a bit more difficult with recent systems and versions of those libraries (due to a bit of coupling with the loader and special linker tricks used), it is definitely possible - I did it a few times, and know of other cases in which it is still done. However, in that case you need to link a whole executable statically. Static linkage with standard libraries combined with dynamic linkage with other objects is not normally feasible.
Edit: One issue which I forgot to mention but is important to take into account is C++ specific. C++ was unfortunately not designed to work well with the classic model of object linkage and loading (used on Unix and other systems). This makes shared libraries in C++ not really portable as they should be, because a lot of things such as type information and templates are not cleanly separated between objects (often being taken, together with a lot of actual library code at compile time from the headers). libstdc++ for that reason is tightly coupled with GCC, and code compiled with one version of g++ will in general only work with the libstdc++ from with this (or a very similar) version of g++. As you will surely notice if you ever try to build a program with GCC 4 with any non-trivial library on your system that was built with GCC 3, this is not just libstdc++. If your reason for wanting to do that is trying to ensure that your shared object is always linked with the specific versions of libstdc++ and libpthread that it was built against, this would not help because a program that uses a different/incompatible libstdc++ would also be built with an incompatible C++ compiler or version of g++, and would thus fail to link with your shared object anyway, aside from the actual libstdc++ conflicts.
If you wonder "why wasn't this done simpler?", a general rumination worth pondering: For C++ to work nicely with dynamic/shared libraries (meaning compatibility across compilers, and the ability to replace a dynamic library with another version with a compatible interface without rebuilding everything that uses it), not just compiler standartization is needed, but at the level of the operating system's loader, the structure and interface of object and library files and the work of the linker would need to be significantly extended beyond the relatively simple Unix classics used on common operating systems (Microsoft Windows, Mach based systems and NeXTStep relatives such as Mac OS, VMS relatives and some mainframe systems also included) for natively built code today. The linker and dynamic loader would need to be aware of such things as templates and typing, having to some extent functionality of a small compiler to actually adapt the library's code to the type given to it - and (personal subjective observation here) it seems that higher-level intermediate intermediate code (together with higher-level languages and just-in-time compilation) is catching ground faster and likely to be standardized sooner than such extensions to the native object formats and linkers.
You mentioned in a separate comment that you are trying to port a C++ library to an embedded device. (I am adding a new answer here instead of editing my original answer here because I think other StackOverflow users interested in this original question may still be interested in that answer in its context)
Obviously, depending on how stripped down your embedded system is (I have not much embedded Linux experience, so I am not sure what is most likely), you may of course be able to just install the shared libstdc++ on it and dynamically link everything as you would do otherwise.
If dynamically linking with libstdc++ would not be good for you or not work on your system (there are so many different levels of embedded systems that one cannot know), and you need to link against a static libstdc++, then as I said, your only real option is static linking the executable using the library with it and libstdc++. You mentioned porting a library to the embedded device, but if this is for the purpose of using it in some code you write or build on the device and you do not mind a static libstdc++, then linking everything statically (aside from perhaps libc) is probably OK.
If the size of libstdc++ is a problem, and you find that your library is actually only using a small part of its interfaces, then I would nonetheless suggest first trying to determine the actual space you would save by linking against only the parts you need. It may be significant or not, I never looked that deep into libstdc++ and I suspect that it has a lot of internal dependencies, so while you surely do not need some of the interfaces, you may or may not still depend on a big part of its internals - I do not know and did not try, but it may surprise you. You can get an idea by just linking a binary using the library against a static build of it and libstdc++ (not forgetting to strip the binary, of course), and comparing the size of the resulting executable that with the total size of a (stripped) executable dynamically linked together with the full (stripped) shared objects of the library and libstdc++.
If you find that the size difference is significant, but do not want to statically link everything, you try to reduce the size of libstdc++ by rebuilding it without some parts you know that you do not need (there are configure-time options for some parts of it, and you can also try to remove some independent objects at the final creation of libstdc++.so. There are some tools to optimize the size of libraries - search the web (I recall one from a company named MontaVista but do not see it on their web site now, there are some others too).
Other than the straightforward above, some ideas and suggestions to think of:
You mentioned that you use uClibc, which I never fiddled with myself (my experience with embedded programming is a lot more primitive, mostly involving assembly programming for the embedded processor and cross-compiling with minimal embedded libraries). I assume you checked this, and I know that uClibc is intended to be a lightweight but rather full standard C library, but do not forget that C++ code is hardly independent on the C library, and g++ and libstdc++ depend on quite some delicate things (I remember problems with libc on some proprietary Unix versions), so I would not just assume that g++ or the GNU libstdc++ actually works with uClibc without trying - I don't recall seeing it mentioned in the uClibc pages.
Also, if this is an embedded system, think of its performance, compute power, overall complexity, and timing/simplicity/solidity requirements. Take into consideration the complexity involved, and think whether using C++ and threads is appropriate in your embedded system, and if nothing else in the system uses those, whether it is worth introducing for that library. It may be, not knowing the library or system I cannot tell (again, embedded systems being such a wide range nowadays).
And in this case also, just a quick link I stumbled upon looking for uClibc -- if you are working on an embedded system, using uClibc, and want to use C++ code on it -- take a look at uClibc++. I do not know how much of the standard C++ stuff you need and it already supports, and it seems to be an ongoing project, so not clear if it is in a state good enough for you already, but assuming that your work is also under development still, it might be a good alternative to GCC's libstdc++ for your embedded work.
I think this guy explains quite well why that wouldn't make sense. C++ code that uses your shared object but a different libstdc++ would link alright, but wouldn't work.

Resources