Dynamic linking in a shared object .so file - linux

I want to use an SDK/library that uses glibc2.14. My machine has glibc2.12. I installed glibc2.14 in a separate location and used the SDK in my executable via the --rpath link option, and it works well.
Now, I want to use the SDK (that uses glibc2.14) in a shared object binary (.so). I tried the --rpath and --dynamic-linker options, but the shared object is not loaded and I get a runtime error:
/lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/local/lib/libsdk.so.1)
How do I make the shared object binary look at glibc2.14?

Now, I want to use the SDK (that uses glibc2.14) in a shared object binary (.so).
You can't.
As this answer explains, ld-linux and libc.so must come from the same build of GLIBC.
Since ld-linux is determined by the main executable (it is hard-coded into it at static link time), it will not match your custom libc.so.6 no matter how you compile your .so.
In addition, specifying --dynamic-linker while building a .so is pointless: it is ld-linux's job to load the .so, so by definition ld-linux must already be loaded before any .so is loaded into the process.
P.S. If it were possible to make your .so use a different libc.so.6, the result would be an (almost) immediate crash, as libc.so.6 is not designed to work when multiple copies are loaded into the same process.
Update:
upgrade my OS which I don't think is possible because I am developing the software for a client and getting them to upgrade is not going to be easy. Second would be to ask the SDK supplier to recompile with glibc2.12.
GLIBC-2.14 was released 9 years ago. It's not unreasonable for your supplier to "only" support 2.14 (and later), and it is somewhat unreasonable for your client to run such an old OS.
I think you have 3rd possible option: have the client install newer GLIBC in parallel, and build their main executable with --rpath and --dynamic-linker flags (as you have done). Their binary will then have no problem loading your SDK.

What is the role of program interpreters in executable files?

I was going through the disassembly of ELF executables and understanding the ELF format. In there, I saw /lib64/ld-linux-x86-64.so.2 used as the program interpreter in the generated executable.
My guess is: I had used printf in the source code, which had to be dynamically linked. When I checked the dynamic section, I was able to find a reference to the libc.so.6 shared library (tag: DT_NEEDED). On my system, I found multiple files with that name in different directories:
sourav@ubuntu-VirtualBox:/$ sudo find / -name libc.so.6
/usr/lib/x86_64-linux-gnu/libc.so.6
find: ‘/run/user/1000/doc’: Permission denied
find: ‘/run/user/1000/gvfs’: Permission denied
/snap/snapd/13170/lib/x86_64-linux-gnu/libc.so.6
/snap/snapd/11107/lib/x86_64-linux-gnu/libc.so.6
/snap/core18/1988/lib/i386-linux-gnu/libc.so.6
/snap/core18/1988/lib/x86_64-linux-gnu/libc.so.6
/snap/core18/2128/lib/i386-linux-gnu/libc.so.6
/snap/core18/2128/lib/x86_64-linux-gnu/libc.so.6
So, I guess the purpose of the program interpreter is to resolve these names to the proper libraries and load them during execution. Is this correct?
It seems we can also have executables with no program interpreter (which is the case for the program interpreter itself). In that case, does the system/OS itself load the shared libraries? If so, how does it resolve the paths of the libraries?
Is it possible to generate an executable with no program interpreter using gcc? My gcc version is 'gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)'.
So, I guess the purpose of the program interpreter is to resolve these names to the proper libraries and load them during execution. Is this correct?
Yes, but that's a bit minimalistic. Loading dynamic libraries involves locating them, loading or mapping them into memory if necessary, and resolving the dynamic symbols within, possibly lazily, for multiple kinds of relocations. It involves recursively loading the libraries' own needed libraries. Also, in a dynamically linked executable, the program interpreter provides the program entry point (from the kernel's perspective), so it is also responsible for setting up and transferring control to the program's own entry point (the startup code that eventually calls main() in a C or C++ program).
It seems we can also have executables with no program interpreter (which is the case for the program interpreter itself). In that case, does the system/OS itself load the shared libraries? If so, how does it resolve the paths of the libraries?
You can have ELF executables without a program interpreter, but they are not dynamically linked, at least not in the ELF sense. There are no shared libraries to load, and certainly the system does not load any.
Is it possible to generate an executable with no program interpreter using gcc? My gcc version is 'gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)'.
If you have static versions of all needed libraries available then you should be able to achieve that by including the -static option on the command line when you link the program. It is entirely possible, however, that you do not have the needed static libraries, even if libc is the only library you need.

Create a shared library that subsumes its link-time library dependencies

I am trying to package some native libraries for inclusion into a java natives .jar. Right now, we are targeting 32-bit and 64-bit linux and windows, with macosx upcoming (which would yield a total of 6 variations). In addition, we have some naming problems which would be resolved if we could roll up several small libraries into one big one.
My goal is to convert
my_library.so dependencyA-55.so dependencyB-50.so
into
my_library_without_dependencies.so
I have full (C and C++) sources for dependencyA and dependencyB; however, I would much rather not to meddle in their compilation, as it is quite complex (ffmpeg). I am trying to pull this off using gcc 4.6 (ubuntu 12.04 64-bit), and the solution, if found, should ideally work for 64-bit and 32-bit linux, and 64-bit and 32-bit windows architectures (cross-compiling via mingw32).
Is there any magic combination of linker options that would cause GCC to subsume the dependencies into a single final shared library? I have looked intently at the linker options without success, and related SO questions do not address this use case.
It's not possible.
Shared objects are already a product of the linker and are in ready-to-execute form.
Instead, you can create static libraries "dependencyA.a" and "dependencyB.a" (as you have the source code) and use the --whole-archive linker switch while creating "my_library.so".

In ELF library filenames, how important are the major and minor versions with regards to compatibility?

I have a collection of binaries I installed on a Linux machine. They require the libgfortran library, but on execution display the following message:
error while loading shared libraries: libgfortran.so.2: cannot open shared object file: No such file or directory
The machine already had libgfortran installed, but the name of the library file was libgfortran.so.1.0.0 (and libgfortran.so.1 linked to it).
To my surprise, by simply making a symbolic link libgfortran.so.2 to libgfortran.so.1, as follows:
ln -s /usr/lib64/libgfortran.so.1 /usr/lib64/libgfortran.so.2
this solved my problem and the binaries were able to run, apparently without error.
My question is - why did they even run at all?
Is there not an inbuilt mechanism to detect when the API version is different, or is it only based on the filename?
If there is API detection - then should there not have been a symbol error?
Indeed, what is the purpose of having different major versions between libraries if they are in fact compatible?
(Note to answerers: my question is not about libgfortran in particular, this is just an illustrative example.)
why did it run at all?
It ran because all the dependent symbols were found in the .so that it loaded.
is there an inbuilt mechanism to detect API version differences?
There is symbol versioning support available, but you have to program to it. It depends entirely on whether the developer makes use of it.
If there is API detection ...?
Again, there is symbol versioning available, which is not quite the same.
What's the purpose of having different major versions if it's still compatible?
It's up to the developer.
Note however that it may be that only the elements of the API you made use of were compatible between the two versions. There is every possibility that the code is silently corrupting your data in the background and you won't become aware of it until later down the line.
Minor versions are normally incremented for releases that are ABI compatible with the previous versions. I.e. an application can and will use a new minor version shared library when the latter is updated.
Major versions are normally incremented for releases that are not ABI compatible with the previous major versions.
Normally, you have three filesystem names for one shared library, e.g.:
lrwxrwxrwx 1 max max 13 May 13 11:13 libfix.so -> libfix.so.0.0
lrwxrwxrwx 1 max max 13 May 13 11:13 libfix.so.0 -> libfix.so.0.0
-rwxrwxr-x 1 max max 1665544 May 13 11:13 libfix.so.0.0
The unversioned libfix.so is just a symlink to the fully versioned one. It is used when you link your application with -lfix using the ld linker.
libfix.so.0.0 is the actual shared library. However, this library is linked with the -h libfix.so.0 linker option:
-h name
-soname=name
When creating an ELF shared object, set the internal DT_SONAME
field to the specified name. When an executable is linked with a
shared object which has a DT_SONAME field, then when the executable
is run the dynamic linker will attempt to load the shared object
specified by the DT_SONAME field rather than using the file
name given to the linker.
So, what happens when the application starts is that the runtime linker ld.so actually looks for libfix.so.0, which is a symbolic link to the latest version of the shared library with the same major number. When libfix.so.0.0 is updated to, say, libfix.so.0.1, those symbolic links are updated to point to the new version. Existing and new applications start to use the new version of the shared library the next time you start or link them.

How to build the elf interpreter (ld-linux.so.2/ld-2.17.so) as static library?

I apologize if my question is not precise because I don't have a lot
of Linux related experience. I'm currently building a Linux from
scratch (mostly following the guide at linuxfromscratch.org version
7.3). I ran into the following problem: when I build an executable it
gets a hardcoded path to something called ELF interpreter.
readelf -l program
shows something like
[Requesting program interpreter: /lib/ld-linux.so.2]
I traced this library ld-linux.so.2 to be part of glibc. I am not very
happy with this behaviour because it makes the binary very unportable
- if I change the location of /lib/ld-linux.so.2 the executable no
longer works and the only "fix" I found is to use the patchelf utility
from NixOS to change the hardcoded path to another hardcoded path. For
this reason I would like to link against a static version of the ld
library but such is not produced. And so this is my question, could
you please explain how could I build glibc so that it will produce a
static version of ld-linux.so.2 which I could later link to my
executables. I don't fully understand what this ld library does, but I
assume this is the part that loads other dynamic libraries (or at
least glibc.so). I would like to link my executables dynamically, but
I would like the dynamic linker itself to be statically built into
them, so they would not depend on hardcoded paths. Or alternatively I
would like to be able to set the path to the interpreter with
environment variable similar to LD_LIBRARY_PATH, maybe
LD_INTERPRETER_PATH. The goal is to be able to produce portable
binaries, that would run on any platform with the same ABI no matter
what the directory structure is.
Some background that may be relevant: I'm using Slackware 14 x86 to
build i686 compiler toolchain, so overall it is all x86 host and
target. I am using glibc 2.17 and gcc 4.7.x.
I would like to be able to set the path to the interpreter with environment variable similar to LD_LIBRARY_PATH, maybe LD_INTERPRETER_PATH.
This is simply not possible. Read carefully (and several times) the execve(2), elf(5) & ld.so(8) man pages and the Linux ABI & ELF specifications. And also the kernel code doing execve.
The ELF interpreter is responsible for dynamic linking. It has to be a file (technically a statically linked ELF shared library) at some fixed location in the file hierarchy (often /lib/ld.so.2, /lib/ld-linux.so.2, or /lib64/ld-linux-x86-64.so.2).
The old a.out format from the 1990s had a builtin dynamic linker, partly implemented in old Linux 1.x kernel. It was much less flexible, and much less powerful.
Because the dynamic-linker path is (in principle) arbitrary, the kernel allows several different dynamic linkers to exist, though most systems have only one. This is a good way to parameterize the dynamic linker. If you want to try another one, install it in the file system and generate ELF executables mentioning that path.
With great pain and effort, you might make your own ld.so-like dynamic linker implementing your LD_INTERPRETER_PATH wish, but that linker still has to be an ELF shared library sitting at some fixed location in the file tree.
If you want a system not needing any files (at some predefined, and wired locations, like /lib/ld.so, /dev/null, /sbin/init ...), you'll need to build all its executable binaries statically. You may want (but current Linux distributions usually don't do that) to have a few statically linked executables (like /sbin/init, /bin/sash...) that will enable you to repair a system broken to the point of not having any dynamic linker.
BTW, the /sbin/init (or /bin/sh) path is wired into the kernel itself. You may pass an argument to the kernel at boot time (e.g. with GRUB) to override the default. So even the kernel wants some files to be there!
As I commented, you might look into musl libc for an alternative libc implementation (providing its own dynamic linker). Read also about VDSO, ASLR, and initrd.
In practice, accept the fact that modern Linuxes and Unixes are expecting some non-empty file system ... Notice that dynamic linking and shared libraries are a huge progress (it was much more painful in the 1990s Linux kernels and distributions).
Alternatively, define your own binary format, then make a kernel module or a binfmt_misc entry to handle it.
BTW, most (or all) of Linux is free software, so you can improve it (but this will take months -or many years- of work to you). Please share your improvements by publishing them.
Read also Drepper's How to Write Shared Libraries paper, and this question.
I ran into the same issue. In my case I want to bundle my application with a different GLIBC than comes system installed. Since ld-linux.so must match the GLIBC version I can't simply deploy my application with the according GLIBC. The problem is that I can't run my application on older installations that don't have the required GLIBC version.
The path to the loader interpreter can be modified with --dynamic-linker=/path/to/interp. However, this needs to be set at compile time and therefore would require my application to be installed in that location (or at least I would need to deploy the ld-linux.so that goes with my GLIBC in that location, which goes against a simple xcopy deployment).
So what's needed is an $ORIGIN option equivalent to what the -rpath option can handle. That would allow for a fully dynamic deployment.
The lack of a dynamic interpreter path (at runtime) leaves two options:
a) Use patchelf to modify the path before the executable gets launched.
b) Invoke the ld-linux.so directly with the executable as an argument.
Both options are not as 'integrated' as a compiled $ORIGIN path in the executable itself.

Force GCC to static-link e.g. pthreads (and not dynamic link)

My program is built as a loader plus many modules, which are shared libraries. One of those libraries uses pthreads, and it seems it's bound to the module dynamically (loaded on startup). It would be simpler if I could force pthreads to be linked into the module file. GCC on Linux, how do I do this? I guess a libpthread.a is necessary...
While linking libpthread.a into a shared library is theoretically possible, it is a really bad idea. The reason is that libpthread is part of glibc, and all parts of glibc must match exactly, or you'll see strange and un-explainable crashes.
So linking libpthread.a into your shared library will:
Cause your program to crash when moved to a machine with a different version of glibc
Cause your existing program to crash when your current machine's glibc is upgraded but your module is not re-linked against the updated libpthread.a.
Spare yourself aggravation, and don't do that.