How linux does load libraries in application

How linux does load libraries in application - linux

I'm interested how does linux act in situation that pictured.
You can see library "A" is staticaly linked with application. But this application depends on dynamic library B, and it in turn depends on library A.
So, what library A will use dynamic library B? Staticaly linked library A in my application or it will again load additional library A.
It's important in case if these libraries have different versions.
Also you can suggest me some articles about me, cause for me linker is like black box.

Dynamically linked library B libB.so -which should be dynamically linked, when libB.so is built, to libA.so will not see the statically linked libA.a (even worse, it might have duplicate global variables of that library, so that could give you a nightmare).
Actually, libA.a does not exist in the ELF executable of your main program. Only some but not all object files a*.o from libA.a are statically linked inside your executable (those actually needed).
See Levine Linkers and Loaders book, wikipage on dynamic linking and on ELF, and Drepper's paper How To Write Shared Libraries. See also ld.so(8), ldconfig(8), ldd(1), dlopen(3), mmap(2), proc(5) man pages. Use strace, and try once cat /proc/self/maps ...
In short avoid linking both statically and dynamically the same library (even of similar or different versions).
Rule of thumb: always link dynamically, except when you know what you are doing and why...

Related

Shared library symbol conflicts and static linking (on Linux)

I'm encountering an issue which has been elaborated in a good article Shared Library Symbol Conflicts (on Linux). The problem is that when the execution and .so have defined the same name functions, if the .so calls this function name, it would call into that one in execution rather than this one in .so itself.
Let's talk about the case in this article. I understand the DoLayer() function in layer.o has an external function dependency of DoThing() when compiling layer.o.
But when compiling the libconflict.so, shouldn't the external function dependency be resolved in-place and just replaced with the address of conflict.o/DoThing() statically?
Why does the layer.o/DoLayer() still use dynamic linking to find DoThing()? Is this a designed behavior?

Is this a designed behavior?
Yes.
At the time of introduction of shared libraries on UNIX, the goal was to pretend that they work just as if the code was in a regular (archive) library.
Suppose you have foo() defined in both libfoo and libbar, and bar() in libbar calls foo().
The design goal was that cc main.c -lfoo -lbar works the same regardless of whether libfoo and libbar are archive or a shared libraries. The only way to achieve this is to have libbar.so use dynamic linking to resolve call from bar() to foo(), despite having a local version of foo().
This design makes it impossible to create a self-contained libbar.so -- its behavior (which functions it ends up calling) depends on what other functions are linked into the process. This is also the opposite of how Windows DLLs work.
Creating self-contained DSOs was not a consideration at the time, since UNIX was effectively open-source.
You can change the rules with special linker flags, such as -Bsymbolic. But the rules get complicated very quickly, and (since that isn't the default) you may encounter bugs in the linker or the runtime loader.

Yes, this is a designed behavior. When you link a program into a binary, all the references to named external (non-static) functions are resolved to point into the symbol table for the binary. Any shared libraries that are linked against are specified as DT_NEEDED entries.
Then, when you run the binary, the dynamic linker loads each required shared library to a suitable address and resolves each symbol to an address. Sometimes this is done lazily, and sometimes it is done once at first startup. If there are multiple symbols with the same name, one of them will be chosen by the linker, and your program will likely crash since you may not end up with the right one.
Note that this is the behavior on Linux, which has all symbols as a flat namespace. Windows resolves symbols differently, using a tree topology, which has both advantages (fewer conflicts) and disadvantages (the inability to allocate memory in one library and free it in another).
The Linux behavior is very important if you want things like LD_PRELOAD to work. This allows you to use debugging tools like Electric Fence and CPU profiling tools like the Google performance tools, or replace a memory allocator at runtime. None of these things would work if symbols were preferentially resolved to their binary or shared library.
The GNU dynamic linker does support symbol versions, however, so that it's possible to load multiple versions of a shared library into the same program. Oftentimes distros like Debian will do this with libraries they expect to change frequently, like OpenSSL. If the program uses liba which uses OpenSSL 1.0 and libb which uses OpenSSL 1.1, then the program should still function in such a case since OpenSSL has versioned symbols, and each library will use the appropriate version of the relevant symbol.

What to do when two shared libraries which uses different versions of the same 3rd party library?

I have a process A which uses two shared libraries: libA.so and libB.so. Because the two libraries were written by different people. Unfortunately libA.so uses version 1.0 of the 3rd party library libD.so. While libB.so uses version 2.0 of the library in static form libD.a. I know that if libA.so and libA.so use libD.so, some errors might happen because of the Global Symbol Interpose. But does this situation have the same problem too?
I know the link flag -Bsymbolic could be used on libA.so or libB.so to force the symbol resolving symbols with the library first. In order to make process A run correctly, both of the two libraries must be linked with this flag, am I right? However, I don't have the source code of libA.so. So I cannot re-link the libA.so again.
To be more general, if one process uses two 3rd party libraries, which contains another same 3rd party library. Will the same thing happen? Is there anything I can do to solve this problem?

This may or may not help you, but given the lack of information I'm hoping it at least sparks an idea or leads you to something similar.
This is an application that allows you to alter your shell settings on a per directory basis:
https://github.com/zimbatm/direnv
It sounds like you actually have an issue that would require you to recompile one of your libraries from source though. That's not ideal, but if there is no build using a compatible thirdparty version you might seek a completely different library to accomplish the original task.

GNU/Debian Linux and LD

Lets say I have a massive project which consists of multiple dynamic libraries that will all get installed to /usr/lib or /usr/lib64. Now lets say that one of the libraries call into another of the compiled libraries. If I place both of the libraries that are dependent on eachother in the same location will the ld program be able to allow the two libraries to call eachother?

The answer is perhaps yes, but it is a very bad design to have circular references between two libraries (i.e. liba.so containing function fa, calling function fb from libb.so, calling function ga from liba.so).
You should merge the two libraries in one libbig.so. And don't worry, libraries can be quite big. (some corporations have Linux libraries of several hundred megabytes of code).
The gold linker from package binutils-gold on Debian should be useful to you. It works faster than the older linker from binutils.

Yes, as long as their location is present in set of directories ld searches for libraries in. You can override this set by using LD_LIBRARY_PATH enviroment variable.
See this manual, it will resolve your questions.

If you mean the runtime dynamic linker /lib/ld-linux* (as opposed to /usr/bin/ld), it will look for libraries in your LD_LIBRARY_PATH, which typically includes /usr/lib and /usr/lib64.
In general, /lib/ld-* are used for .so libraries at run-time; /usr/bin/ld is used for .a libraries at compile-time.
However, if your libraries are using dlopen() or similar to find one another (e.g. plug-ins), they may have other mechanisms for finding one another. For example, many plug-in systems will use dlopen to read every library in a certain (one or many) directory/ies.

A question about how loader locates libraries at runtime

Only a minimum amount of work is done
at compile time by the linker; it only
records what library routines the
program needs and the index names or
numbers of the routines in the
library. (source)
So it means ld.so won't check all libraries in its database,only those recorded by the application programe itself, that is to say, only those specified by gcc -lxxx.
This contradicts with my previous knowledge that ld.so will check all libraries in its database one by one until found.
Which is the exact case?

I will make a stab at answering this question...
At link time the linker (not ld.so) will make sure that all the symbols the .o files that are being linked together are satisfied by the libraries the program is being linked against. If any of those libraries are dynamic libraries, it will also check the libraries they depend on (no need to include them in the -l list) to make sure that all of the symbols in those libraries are satisfied. And it will do this recursively.
Only the libraries the executable directly depends on via supplied -l parameters at link time will be recorded in the executable. If the libraries themselves declared dependencies, those dependencies will not be recorded in the executable unless those libraries were also specified with -l flags at link time.
Linking happens when you run the linker. For gcc, this usually looks something like gcc a.o b.o c.o -lm -o myprogram. This generally happens at the end of the compilation process. Underneath the covers it generally runs a program called ld. But ld is not at all the same thing as ld.so (which is the runtime loader). Even though they are different entities they are named similarly because they do similar jobs, just at different times.
Loading is the step that happens when you run the program. For dynamic libraries, the loader does a lot of jobs that the linker would do if you were using static libraries.
When the program runs, ld.so (the runtime loader) actually hooks the symbols up on the executable to the definitions in the shared library. If that shared library depends on other shared libraries (a fact that's recorded in the library) it will also load those libraries and hook things up to them as well. If, after all this is done, there are still unresolved symbols, the loader will abort the program.
So, the executable says which dynamic libraries it directly depends upon. Each of those libraries say which dynamic libraries they directly depend upon, and so forth. The loader (ld.so) uses that to decide which libraries to look in for symbols. It will not go searching through random other libraries in a 'database' to find the appropriate symbols. They must be in libraries that are in the dependency chain.

Shared library internal

I wish to know how the Shared library works I am asking in terms of Symbol table reference.
Is it like when we include a shared library it exports the Symbol table to process then based on some pointers we execute the respective function.
What is the meaning of Shared library Strip ?
Edit :- I wish to know that how shared library works when it get loaded into the memory.\
When a function lets say Fun() get called from the application which has def in library. then how this linking happen . I hope now its clear.

Programs make calls to a shared library through a Procedure Linkage Table, which is filled in by the dynamic linker/loader ld.so based on the information in the dynamic symbol table and relocation entries. On Linux this data is stored in programs and libraries in the ELF format, which you can inspect using programs like objdump and readelf.
This Linux Journal article has a basic overview. For more detailed information check out Ulrich Drepper's excellent paper How To Write Shared Libraries, and the Solaris Linker and Libraries Guide.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string