Managing secondary dependencies of shared libraries - shared-libraries

I have a problem with the secondary dependencies of shared libraries.
Suppose we have the following dependency tree:
libA.so
├─libB.so
└─libC.so
  └─libD.so
That is, shared library A depends on shared libraries B and C, and shared library C itself depends on shared library D. A good example of libC is the GSL shared library, which depends on CBLAS.
To make a self-sufficient, easy-to-use package and to avoid problems with different versions of installed shared libraries, I package libB, libC and libD together with libA. As explained, e.g., in Drepper's article "How to Write Shared Libraries" (2011), I add the link flag -Wl,--enable-new-dtags,-rpath,\$ORIGIN to set the RUNPATH of libA.so, so that the dynamic loader can find the dependencies of libA in the corresponding directories without needing LD_LIBRARY_PATH (which has its downsides).
The problem is that setting the RUNPATH in libA does not help with secondary dependencies such as libD.so.
Turning on the dynamic loader's debug messages (by setting LD_DEBUG) reveals that the loader looks for libD.so only in the standard library locations, not in the directory of libA (contrary to what it does when finding libB or libC).
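For reference, the trace that shows this can be produced with glibc's loader debugging along these lines (myapp stands for any program that loads libA):
# Ask the dynamic loader to print its library search steps; the output goes to stderr.
LD_DEBUG=libs ./myapp 2>&1 | grep -E 'libB|libD'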
Is there a way to overcome this problem?
Indeed, I can link the libraries C and D statically, if I have the source code or properly compiled static libraries.
But is there a better way?
Solution:
As explained by Employed Russian, the solution is to set RPATH instead of RUNPATH.
With recent versions of GCC, one should use the following link flag:
-Wl,--disable-new-dtags,-rpath,\$ORIGIN
RPATH is searched for transitive dependencies; that is, paths in RPATH will be considered for everything that is dynamically loaded, even dependencies of dependencies.
In contrast, the dynamic linker does not search RUNPATH locations when resolving transitive dependencies (dependencies of dependencies).
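For concreteness, a link line for libA.so along these lines does it (the object and library names are placeholders; the single quotes keep the shell from expanding $ORIGIN):
# Link libA.so with DT_RPATH=$ORIGIN so that transitive dependencies are also searched there.
gcc -shared -o libA.so a.o -L. -lB -lC -Wl,--disable-new-dtags,-rpath,'$ORIGIN'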
See also:
Wikipedia: https://en.wikipedia.org/wiki/Rpath#The_role_of_GNU_ld
SO question: How to set RPATH and RUNPATH with GCC/LD?

By setting DT_RUNPATH you are telling the loader that each of your binaries knows where to find its own direct dependencies.
But that isn't true for libC.so -- apparently it has no DT_RUNPATH of its own.
I can link the libraries C and D statically, ... But is there a better way?
Yes: link libC.so with correct DT_RUNPATH (if libC.so and libD.so are in the same directory, then -Wl,-rpath,\$ORIGIN will work for libC.so as well).
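For illustration, assuming libC can be relinked from its objects, something like this gives it its own entry (the file names are placeholders):
# Relink libC.so with RUNPATH=$ORIGIN so the loader can find libD.so next to it.
gcc -shared -o libC.so c.o -L. -lD -Wl,-rpath,'$ORIGIN'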
Update:
The problem is that I may not have the source or the properly compiled object files of libC
In that case you should use RPATH instead of RUNPATH. Unlike the latter, the former applies both to the object itself and to all dependencies of that object.
In other words, using --enable-new-dtags is wrong in this case -- you want the opposite.
In this case, there is no solution (besides static linking); correct?
The other solution (besides using RPATH) is to set LD_LIBRARY_PATH in the environment.
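For completeness, that fallback is just the following (the directory is a hypothetical install location):
# Add the bundled library directory to the loader's search path for this run only.
LD_LIBRARY_PATH=/opt/mypackage/lib ./myapp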
Update 2:
Difference between RPATH and RUNPATH
The difference is explained in the ld.so man page:
If a shared object dependency does not contain a slash, then it
is searched for in the following order:
o Using the directories specified in the DT_RPATH dynamic
section attribute of the binary if present and DT_RUNPATH
attribute does not exist. Use of DT_RPATH is deprecated.
...
o Using the directories specified in the DT_RUNPATH dynamic
section attribute of the binary if present. Such directories
are searched only to find those objects required by DT_NEEDED
(direct dependencies) entries and do not apply to those
objects' children, which must themselves have their own
DT_RUNPATH entries. This is unlike DT_RPATH, which is applied
to searches for all children in the dependency tree.
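To check which of the two tags a given binary actually carries, readelf can be used (libA.so stands for any ELF object):
# DT_RPATH is reported as RPATH, DT_RUNPATH as RUNPATH in the dynamic section.
readelf -d libA.so | grep -E 'RPATH|RUNPATH'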

Related

Are linux package manager installed libraries statically or dynamically linked?

If, for example, Crypto++ is installed using sudo apt install libcrypto++-dev and then included using #include <cryptopp/base64.h>, will the library be statically or dynamically linked?
CMakeLists.txt includes cryptopp in target_link_libraries.
will [a library installed via a package manager] be statically or dynamically linked?
It depends on several factors. First, both the static and the dynamic library must be available. In the case of Crypto++ on Unix and Linux, both are available. On Windows, only a static library is provided.
Second, assuming both are available, the linker's configuration matters. On Linux with ld, the dynamic library is always used by default. On OS X, a dynamic library is likewise used by default. On Windows the linker configuration does not factor into it, because explicit options control which one is used.
Third, it depends on linker options. On Windows - and if a dynamic library was available - it would depend on the library you link to: either the static library or the import library for a dynamic link library. On Linux with ld you can use :filename to link with the static library (a short example follows the quoted manual text below):
-l namespec
--library=namespec
Add the archive or object file specified by namespec to the list of
files to link. This option may be used any number of times. If
namespec is of the form :filename, ld will search the library path for
a file called filename, otherwise it will search the library path for
a file called libnamespec.a.
On systems which support shared libraries, ld may also search for
files other than libnamespec.a. Specifically, on ELF and SunOS
systems, ld will search a directory for a library called
libnamespec.so before searching for one called libnamespec.a. (By
convention, a .so extension indicates a shared library.) Note that
this behavior does not apply to :filename, which always specifies a
file called filename.
The linker will search an archive only once, at the location where it
is specified on the command line. If the archive defines a symbol
which was undefined in some object which appeared before the archive
on the command line, the linker will include the appropriate file(s)
from the archive. However, an undefined symbol in an object appearing
later on the command line will not cause the linker to search the
archive again.
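As a sketch of the :filename form for this case (assuming the static archive is installed as libcryptopp.a; on some distributions the file may be named libcrypto++.a instead):
# Link against the static archive explicitly, even if libcryptopp.so is also installed.
g++ main.cpp -o main -l:libcryptopp.a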
Finally, behavior is not a simple matter when using CMake. The default behavior will likely be to not add anything. Adding -lcryptopp or -l:cryptopp to your LDFLAGS or LDLIBS will have no effect because CMake does not honor customary flags. You will have to add the library to every target manually.

How does a process find dynamic shared libraries in Linux?

I was reading APUE and saw the following:
Several more segment types exist in an a.out, containing the symbol table, debugging information, linkage tables for dynamic shared libraries, and the like. These additional sections don’t get loaded as part of the program’s image executed by a process.
But how does a process find the necessary .so files if the linkage tables for dynamic shared libraries are not loaded into the program's image?
The dynamic linker is loaded as the program interpreter, named in the .interp section and recorded in the PT_INTERP program header at link time.
The dynamic linker then reads the DT_NEEDED ELF tags (added during static linking) to figure out which shared libraries it needs to resolve.
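Both pieces of information can be inspected with readelf (myprog is a placeholder):
# Show the requested program interpreter, i.e. the dynamic linker itself.
readelf -l ./myprog | grep -A1 INTERP
# List the DT_NEEDED entries recorded at static link time.
readelf -d ./myprog | grep NEEDED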
Finally, on resolving library dependencies, from ld.so(8):
When resolving library dependencies, the dynamic linker first inspects
each dependency string to see if it contains a slash (this can occur
if a library pathname containing slashes was specified at link time).
If a slash is found, then the dependency string is interpreted as a
(relative or absolute) pathname, and the library is loaded using that
pathname.
If a library dependency does not contain a slash, then it is searched
for in the following order:
(ELF only) Using the directories specified in the DT_RPATH dynamic
section attribute of the binary if present and DT_RUNPATH attribute
does not exist. Use of DT_RPATH is deprecated.
Using the environment variable LD_LIBRARY_PATH. Except if the
executable is a set-user-ID/set-group-ID binary, in which case it is
ignored.
(ELF only) Using the directories specified in the DT_RUNPATH dynamic
section attribute of the binary if present.
From the cache file /etc/ld.so.cache, which contains a compiled list
of candidate libraries previously found in the augmented library path.
If, however, the binary was linked with the -z nodeflib linker option,
libraries in the default library paths are skipped. Libraries
installed in hardware capability directories (see below) are preferred
to other libraries.
In the default path /lib, and then /usr/lib. If the binary was linked
with the -z nodeflib linker option, this step is skipped.
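To see where that search actually resolves each dependency for a particular binary, ldd prints the outcome (myprog is a placeholder):
# Print each needed library together with the path the loader chose for it.
ldd ./myprog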

Explanation sought: libtool, automake, shared libraries (and Fortran)

The problem I had is solved. I'm posting this to solicit an explanation as to why the solution actually works. I've gotten great feedback here before.
I have a legacy code base that used a very simplistic build system, and my project is to migrate that to Autotools for customization and, particularly, building shared libraries. The main library is written in C, but must also be linkable from Fortran (for legacy purposes), and is distributed with some test codes in F77. The authors organized the source code into modules...
src_module1/
src_module2/
...
testc/
testf77/
They built the library lib/libmain.a by compiling code in the src_*/ directories and archiving the objects with ranlib.
My first approach was to build a shared library from each src_*/ separately and "link" all of these into one shared library. Using Autotools, the src_module1/Makefile.am would contain
noinst_LTLIBRARIES = libmodule1.la
libmodule1_la_SOURCES = ...
and so on for the other modules, and finally the lib/Makefile.am would need only:
lib_LTLIBRARIES = libmain.la
libmain_la_SOURCES =
libmain_la_LIBADD = $(top_srcdir)/src_module1/libmodule1.la ...
This seemed to work perfectly. However, when the code in testc/ was compiled and linked against libmain.la, a "symbols not found" error was issued.
Thinking that this was an issue with libtool or shared libraries, I tried building static only, basically changing all .la to .a and all _LTLIBRARIES to _LIBRARIES. Same problem. This time, however, I noticed the warning "ranlib: warning for library: libmain.a the table of contents is empty (no object file members in the library define global symbols)" when trying to link libmain.a itself.
The solution that I found seems like a hack. I did not build Makefiles for any of the src_*/ directories, but instead used only for the lib/ directory and its Makefile.am had the lines:
lib_LTLIBRARIES = libmain.la
libmain_la_SOURCES = [all sources from all ../src_modules/ ]
This worked. The compiled programs in testc/ linked against libmain.la without issue. One of the "modules" is a set of Fortran bindings that wrap other C functions in the library. Even the Fortran codes in testf77/ linked against libmain.la properly.
Could someone carefully explain what happens when libtool builds a shared library? Or even when building a static library? Why is it that several static libraries can't be linked together to make one static library? Why are symbols only available when libtool/ranlib builds the library "from sources"? And what about installing a shared/static library, i.e. moving it to the /usr/local/lib --- what happens there? The Wikipedia article on static and shared libraries isn't really detailed enough for me.
I do appreciate all efforts to make sense of my longwinded question.
What you first tried ought to work. I am using this kind of setup all the time (in a C++ context). It's also documented, and part of the Automake test suite (although maybe not with Fortran).
A libtool library that is not installable, i.e., one declared with noinst_LTLIBRARIES, is called a libtool convenience library. That noinst_ makes a big difference in what is built. Even if Libtool is configured to build shared libraries, a libtool convenience library is not actually a shared library: it is just a set of object files (compiled as PIC so that they can later be used in a shared library) stored in an archive. You can use a libtool convenience library anywhere using this set of objects would make sense, e.g., to build a shared library.
When multiple libtool convenience libraries are LIBADDed to an installable libtool library (such as your libmain.la), Libtool has to unpack the archives containing the objects of each convenience library and link them into the final library.
There is a trap worth noting here: when building a shared library out of convenience libraries, if the _SOURCES variable is empty, Automake does not know which linker to use and defaults to the C linker. If you want to trick Automake into using the linking rule for some specific language, you can declare a nodist_EXTRA_..._SOURCES source file that does not have to exist. (See the Libtool Convenience Libraries section of the Automake manual for an example.)
Maybe that was your problem? If you have some Fortran files in the sources of some of your modules (your description suggests these are only C files), the Fortran linker will be used to build libmain.la only if a Fortran file appears in the source files declared for that libtool library. And the C linker will be used when libmain_la_SOURCES is empty.
Otherwise, I have no idea why it didn't work.
There is a small error in your example:
libmain_la_LIBADD = $(top_srcdir)/src_module1/libmodule1.la
should be
libmain_la_LIBADD = $(top_builddir)/src_module1/libmodule1.la
because the library is not created in the source directory. However I assume this is just a typo, and you won't see the difference unless you do a VPATH build or run make distcheck.
Your second try, using _LIBRARIES without Libtool, is not expected to work.
_LIBRARIES can only be used to declare static archives, and in this case _LIBADD may only contain object files, not other static archives. Unpacking an archive to reuse its objects into another archive can be tricky to do portably. Automake's answer to this problem has always been: install Libtool and use _LTLIBRARIES (Libtool can be configured to build only static libraries).
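To see why, here is a rough sketch of what merging static archives by hand involves, which is essentially what Libtool automates for convenience libraries (the paths are placeholders based on the layout above):
# Extract the object members of each per-module archive, then re-archive them into one library.
mkdir tmp && cd tmp
ar x ../src_module1/libmodule1.a
ar x ../src_module2/libmodule2.a
cd ..
ar cr lib/libmain.a tmp/*.o
ranlib lib/libmain.a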

making gcc prefer static libs to shared objects when linking?

When linking against libraries using the -l option (say -lfoo), gcc will prefer a shared object to a static library if both are found (will prefer libfoo.so to libfoo.a). Is there a way to make gcc prefer the static library, if both are found?
The issue I'm trying to solve is the following: I'm creating a plugin for an application (the flight simulator called X-Plane), with the following constraints:
the plugin is to be in the form of a 32 bit shared object, even when running on a 64 bit system
the running environment does not provide a convenient way to load shared objects which are not in the 'normal' locations, say /usr/lib or /usr/lib32:
one cannot expect the user to set LD_PRELOAD or LD_LIBRARY_PATH to find shared objects shipped with my plugin
the X-Plane running environment does not add my plugin's directory to LD_LIBRARY_PATH before dynamically loading the plugin shared object, which would have allowed me to ship all required shared objects alongside my plugin shared object
one cannot expect 64 bit users to install 32 bit shared objects that are non-trivial (say, are not included in the ia32-libs package on ubuntu)
To satisfy the above constraints, a possible solution is to link the generated shared object against static, 32-bit versions of all non-trivial libraries used. But when such libraries are installed, usually both static and dynamic versions are installed, and thus gcc will always link against the shared object instead of the static library.
Of course, moving/removing/deleting the shared objects in question and just leaving the static libraries in, say, /usr/lib32 is a work-around, but it is not a nice one.
Note:
yes, I did read up on how to link shared objects and libraries, and I'm not trying to create a 'totally statically linked shared object'
yes, I tried -Wl,-static -lfoo -Wl,-Bdynamic, but didn't bring the expected results
yes, I tried -l:libfoo.a as well, but this didn't bring the expected results either
You can specify the full path to the static libs without the -l flag to link with those.
gcc ... source.c ... /usr/lib32/libmysuperlib.a ...
Just add the .a file to the link line without -l as if it were a .o file.
It's dated, but may work: http://www.network-theory.co.uk/docs/gccintro/gccintro_25.html
(almost end of the page)
"As noted earlier, it is also possible to link directly with individual library files by specifying the full path to the library on the command line."

Restricting symbols to local scope for linux executable

Can anyone please suggest some way we can restrict exporting of our symbols to global symbol table?
Thanks in advance
Hi,
Thanks for replying...
Actually, I have an executable which is statically linked against a third-party library, say "ver1.a", and which also uses a third-party ".so" file that is in turn linked with a different version of the same library, say "ver2.a". The problem is that the implementations of these two versions differ. When the executable is loaded, the symbols from "ver1.a" get exported to the global symbol table. Whenever the ".so" is then loaded and tries to refer to symbols from ver2.a, it ends up resolving them to the previously loaded symbols from "ver1.a", which crashes our binary.
We thought of a solution: do not export the executable's symbols to the global symbol table, so that when the ".so" gets loaded and tries to use symbols from ver2.a, it won't find them in the global symbol table and will use its own symbols, i.e. the symbols from ver2.a.
I can't find any way to restrict the exporting of symbols to the global symbol table. I tried --version-script and retain-symbol-file, but they didn't work. The -fvisibility=hidden option gives an error that "-f option may only be used with -shared", so I guess this, like --version-script, works only for shared libraries, not for executable binaries.
The code is in C++, the OS is Linux, and the GCC version is 3.2. It may not be possible to recompile any of the third-party libraries or ".so" files, so the option of recompiling the ".so" with the -Bsymbolic flag is ruled out.
Any help would be appreciated.
Pull in the 3rd party library with dlopen.
You might be able to avoid that by creating your own shared lib that hides all the third party symbols and only exposes your own API to them, but if all else fails dlopen gives you complete control.
I had, what sounds like, a similar issue/question: Segfault on C++ Plugin Library with Duplicate Symbols
If you can rebuild the 3rd party library, you could try adding the linker flag -Bsymbolic (the flag to gcc/g++ would be -Wl,-Bsymbolic). That might solve your issue. It all depends on the organization of your code and stuff, as there are caveats to using it:
http://www.technovelty.org/code/c/bsymbolic.html
http://software.intel.com/en-us/articles/performance-tools-for-software-developers-bsymbolic-can-cause-dangerous-side-effects/
If you can't rebuild it, according to the first caveat link:
In fact, the only thing the -Bsymbolic flag does when building a shared library is add a flag in the dynamic section of the binary called DT_SYMBOLIC.
So maybe there's a way to add the DT_SYMBOLIC flag to the dynamic section post-linking?
The simplest solution is to rename the symbols (by changing source code) in your executable so they don't conflict with the shared library in the first place.
The next simplest thing is to localize the "problem" symbols with 'objcopy -L problem_symbol'.
Finally, if you don't link directly with the third party library (but dlopen it instead, as bmargulies suggests), and none of your other shared libraries use or define the "problem" symbol, and you don't link with -rdynamic or one of its equivalents, then the symbol should not be exported to the dynamic symbol table of the executable, and thus you shouldn't have a conflict.
Note: 'nm a.out' will still show the symbol as globally defined, but that doesn't matter for dynamic linking. You want to look at the dynamic symbol table of a.out with 'nm -D a.out'.
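A sketch of the objcopy route together with that verification step (the symbol and file names are placeholders):
# Demote the conflicting symbol to local binding in a copy of the executable.
objcopy -L problem_symbol a.out a.out.localized
# Inspect what the copy actually exports for dynamic linking.
nm -D a.out.localized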
