GCC: Specifying a Minimum Shared Library Version - linux

Background
I inherited and maintain a Linux shared library that is very closely coupled with specific hardware; let's call it libfoo.so.0.0.0. This library has been around for some time and "just worked". This library has now become a dependency for several higher-layer applications.
Now, unfortunately, new hardware designs have forced me to create symbols with wider types, thereby resulting in libfoo.so.0.1.0. There have been only additions; no deletions or other API changes. The original, narrow versions of the updated symbols still exist in their original form.
Additionally, I have an application (say, myapp) that depends on libfoo. It was originally written to support the 0.0.0 version of the library but has now been reworked to support the new 0.1.0 APIs.
For backwards compatibility reasons, I would like to be able to build myapp for either the old or new library via a compile flag. The kernel that a given build of myapp will be loaded on will always have exactly one version of the library, known at compile time.
The Question
It is very likely that libfoo will be updated again in the future.
When building myapp, is it possible to specify a minimum version of libfoo to link against based on a build flag?
I know it is possible to specify the library name directly on the build CLI. Will this cause myapp to require exactly that version or will later versions of the lib with the same major revision still be able to link against it (ex. libfoo.so.0.2.0)? I am really hoping to not have to update every dependent app's build each time a new minor version is released.
Is there a more intelligent way of accomplishing this in an application-agnostic way?
References
How do you link to a specific version of a shared library in GCC

You are describing external library versioning, where the app is built against libfoo.so.0, libfoo.so.1, etc. Documentation here.
Using external library versioning requires that exactly the same version of libfoo.so.x be present at runtime.
This is generally not the right technique on Linux, which, through the magic of symbol versioning, allows a single libfoo.so.Y to provide multiple incompatible definitions of the same symbol, and thus allows a single library serve both the old and the new applications simultaneously.
In addition, if you are simply always adding new symbols, and are not modifying existing symbols in incompatible way, then there is no reason to increment the external version. Keep libfoo.so at version 0, provide a int foo_version_X_Y; global variable in it (as well as all previous versions: foo_version_1_0, foo_version_1_1, etc.), and have an application binary read the variable that it requres. If an application requires a new symbol foo_version_1_2 and is run with an old library that only provides foo_version_1_1, then the application will fail to start with an obvious error.

Related

How to detect missing symbols in shared library with libtool

As stated, I want to be able to check that a shared library, created by libtool, is not missing any symbols,
I have written a library that is built as a shared library, 'A'. It depends in turn on another library 'B'.
The other library 'B' does not follow strict semver, and so sometimes introduces new functions in minor or patch releases.
Although I try to put appropriate #if B_LIB_VERSION >= 42 in the code for my library to not attempt to call a function in library B if it is not going to be available, apparently I sometimes get the version incorrect. This causes an error when the program is run.
Is it possible with libtool, or any other tool, to ask it to produce a list of all the symbols that are not found in a shared library, or any of the libraries that it will load?
As stated, I want to be able to check that a shared library, created by libtool, is not missing any symbols,
That's hard to do with shared libraries, as they are designed to allow for late symbol resolution. If you're not using dlopen type features, you might be able to build static executables from static versions of A and B and look for missing symbols.
The other library 'B' does not follow strict semver, and so sometimes introduces new functions in minor or patch releases.
I'd seriously consider searching for a replacement library, than having to keep on dealing with their dependency issues.
Is it possible with libtool, or any other tool, to ask it to produce a list of all the symbols that are not found in a shared library, or any of the libraries that it will load?
No, not really. nm will give you a list of symbols that are undefined (and referenced) in a shared library. objdump might be of some use also. On linux, ldd might do some of what you want. But generally there is no way of knowing exactly what a shared library loads, even without considering dlopen.
libltdl might be of some use also if you have to stick with the misbehaving library. At least you can figure out at runtime if libB.42 has symbol xyz or not. It's not as easy as the conditional code way of doing things.

Determining binary compatibility under linux

What is the best way to determine a pre-compiled binary's dependencies (specifically in regards to glibc and libstdc++ symbols & versions) and then ensure that a target system has these installed?
I have a limitation in that I cannot provide source code to compile on each machine (employer restriction) so the defacto response of "compile on each machine to ensure compatibility" is not suitable. I also don't wish to provide statically compiled binaries -> seems very much a case of using a hammer to open an egg.
I have considered a number of approaches which loosely center around determining the symbols/libraries my executable/library requires through use of commands such as
ldd -v </path/executable>
or
objdump -x </path/executable> | grep UND
and then somehow running a command on the target system to check if such symbols, libraries and versions are provided (not entirely certain how I do this step?).
This would then be followed by some pattern or symbol matching to ensure the correct versions, or greater, are present.
That said, I feel like this will already have been largely done for me and I'm suffering from ... "a knowledge gap ?" of how it is currently implemented.
Any thoughts/suggestions on how to proceed?
I should add that this is for the purposes of installing my software on a wide variety of linux distributions - in particular customised clusters - which may not obey distribution guidelines or standardised packaging methods. The objective being a seamless install.
I wish to accomplish binary compatibility at install time, not at a subsequent runtime, which may occur by a user with insufficient privileges to install dependencies.
Also, as I don't have source code access to all the third party libraries I use and install (specialised maths/engineering libraries) then an in-code solution does not work so well. I suppose I could write a binary that tests whether certain symbols (&versions) are present, but this binary itself would have compatibility issues to run.
I think my solution has to be to compile against older libraries (as mentioned) and install this as well as using the LSB checker (looks promising).
GNU libraries (glibc and libstdc++) support a mechanism called symbol versioning. First of all, these libraries export special symbols used by dynamic linker to resolve the appropriate symbol version (CXXABI_* and GLIBCXX_* in libstdc++, GLIBC_* in glibc). A simple script to the tune of:
nm -D libc.so.6 | grep " A "
will return a list of version symbols which can then be further shell processed to establish the maximum supported libc interface version (same works for libstdc++). From a C code, one has the option to do the same using dlvsym() (first dlopen() the library, then check whether certain minimal version of the symbols you need can be looked up using dlvsym()).
Other options for obtaining a glibc version in runtime include gnu_get_libc_version() and confstr() library calls.
However, the proper use of versioning interface is to write code which explicitly links to a specific glibc/libstdc++ library version. For example, code linking to GLIBC_2.10 interface version is expected to work with any glibc version newer than 2.10 (all versions up to 2.18 and beyond). While it is possible to enable versioning on a per symbol basis (using a ".symver" assembler/linker directive) the more reasonable approach is to set up a chroot environment using older (minimal supported) version of the toolchain and compile the project against it (it will seamlessly run with whatever newer version encountered).
Use Linux Application Checker tool ([1], [2], [3]) to check binary compatibility of your application with various Linux distributions. You can also check compatibility with your custom distribution by this tool.

Finding the shared library name to use with dlload

In my open-source project Artha I use libnotify for showing passive desktop notifications to the user.
Instead of statically linking libnotify, a lookup at runtime is made for the shared object (.so) file via dlload, if available on the target machine, Artha exposes the notification feature in it's GUI. On app. start, a call to dlload with filename param as libnotify.so.1 is made and if it returns a non-null pointer, then the feature is exposed.
A recurring problem with this model is that every time the version number of the library is bumped, Artha's code needs to be updated, currently libnotify.so.4 is the latest to entail such an occurance.
Is there a linux system call (irrespective of the distro the app. is running on), which can tell me if a particular library's shared object is available at runtime? I know that there exists the bruteforce option of enumerating the library by going from 1 to say 10, I find the solution ugly and inelegant.
Also, if this can be addressed via autoconf, then that solution is welcome too I.e. at build time, based on the target machine, the configure.h generated should've the right .so name that can be passed to dlload.
P.S.: I think good distros follow the style of creating links to libnotify.so.x so that a programmer can just do dlload("libnotify.so", RTLD_LAZY) and the right version numbered .so is loaded; unfortunately not all distros follow this, including Ubuntu.
The answer is: you don't.
dlopen() is not designed to deal with things like that, and trying to load whichever soversion you find on the system just because it happens to have the symbols you need is not a good way to do it.
Different sonames have different ABIs, and different ABIs means that you may be calling the same exact symbol name that is expecting a different set (or different size) of parameters, which will cause crashes or misbehaviour that are extremely difficult do debug.
You should have a read on how shared object versions work and what an ABI is.
The libfoo.so link is there for the link editor (ld) and is usually installed with the -devel packages for that reason; it might also very well not be a link but rather a text file with a linker script, often times on purpose to avoid exactly what you're trying to do.

What are the pro and cons of statically linking a library?

I want to release an application I developed as a hobby both for Linux and Windows. This application depends on boost (and possibly other libraries). The norm for this kind of application (a chess engine) is to provide only an executable file and possibly some helper files.
I tough it would be a good idea to statically link the libraries so the executable would not have any dependencies. So the end user can just put the executable in a directory and start using it.
However, while doing some research online I found some negative comments about statically linking libraries, some even arguing that an application with statically linked libraries would be hardly portable, meaning that it would only run on my system of highly similar systems.
So what are the pros and cons of statically linking library?
I already know that the executable will be bigger. But I can't see why it would make my application less portable.
Pros:
No dependencies.
Cons:
Higher memory usage, as the OS can no longer use a shared copy of the library.
If the library needs to be updated, your application needs to be rebuilt. This is doubly important for libraries that then have security fixes.
Of course, a bigger issue for portability is the lack of source code distribution.
Let's say the static library "A" you include has a dependency on function "B". If this dependency can't be fulfilled by the target system, then your program won't run.
But if you're using dynamic linking, the user could maybe install another version of library "A" that uses function "C" instead of "B", so it can run successfully.
If you link the libraries statically, unless you add the smarts to also check the user's system for the libraries you've linked, you're locking your application to use those versions of the libraries until you update your executable. Security holes happen, and updates happen. (For a chess engine there may not be too much issue, but who knows.)
With dynamically linked libraries, if the library say X, you have linked with is not available at the user system, your code crashes ungracefully leaving the end user wondering.
Whereas, in the case of static libraries everything is fused into the executable, so a condition like above mayn't happen, the executable however will be very bulky.
The above problem in dynamically linked libraries can however, be eliminated by dynamic loading.

Loading multiple shared libraries with different versions

I have an executable on Linux that loads libfoo.so.1 (that's a SONAME) as one of its dependencies (via another shared library). It also links to another system library, which, in turn, links to a system version, libfoo.so.2. As a result, both libfoo.so.1 and libfoo.so.2 are loaded during execution, and code that was supposed to call functions from library with version 1 ends up calling (binary-incompatible) functions from a newer system library with version 2, because some symbols stay the same. The result is usually stack smashing and a subsequent segfault.
Now, the library which links against the older version is a closed-source third-party library, and I can't control what version of libfoo it compiles against. Assuming that, the only other option left is rebuilding a bunch of system libraries that currently link with libfoo.so.2 to link with libfoo.so.1.
Is there any way to avoid replacing system libraries wiith local copies that link to older libfoo? Can I load both libraries and have the code calling correct version of symbols? So I need some special symbol-level versioning?
You may be able to do some version script tricks:
http://sunsite.ualberta.ca/Documentation/Gnu/binutils-2.9.1/html_node/ld_26.html
This may require that you write a wrapper around your lib that pulls in libfoo.so.1 that exports some symbols explicitly and masks all others as local. For example:
MYSYMS {
global:
foo1;
foo2;
local:
*;
};
and use this when you link that wrapper like:
gcc -shared -Wl,--version-script,mysyms.map -o mylib wrapper.o -lfoo -L/path/to/foo.so.1
This should make libfoo.so.1's symbols local to the wrapper and not available to the main exe.
I can only come up with a work-around. Which would be to statically link a version of the "system library" that you are using. For your static build, you could make it link against the same old version as the third-party library. Given that it does not rely on the newer version...
Perhaps it is also possible to avoid these problems with not linking to the third-party library the ordinary way. Instead, your program could load it at execution time. Perhaps then it could be shadowed against the rest. But I don't know much about that.

Resources