Weak symbols, shared libraries and dlopen - linux

I have a binary with a weak symbol that I want to be able to link at runtime with a run dependent shared library.
$nm testrun
...
w basic2.test
...
My first test was using a .o file at static linktime, that worked, but I need it to be shared.
So, my second test was getting a shared library with that symbol defined and link it at compile time with -lmy (libmy.so), and this, actually worked as well.
Third step tried not linking at compile time and use ld_preload trick and this did not work.
nm libmy.so
...
00000550 T basic2.test
...
I have really no idea why this particular one does not work, looks like dynamic loader should have enough information to set testruns weak symbol with the one in libmy.so.
My final objective, which I guess will require more work is to load at start a small function that does check for the appropiate symbol with dlsym and sets it there.
Any hint?

It seems that you may need to use LD_DYNAMIC_WEAK along with LD_PRELOAD from the man page:
LD_DYNAMIC_WEAK (glibc since 2.1.91) Allow weak symbols to be overridden (reverting to old glibc behavior). For security reasons, since glibc 2.3.4, LD_DYNAMIC_WEAK is ignored for set-user-ID/set-group-ID binaries.
Note: it could be a typo, but you should use -lmylib.so and not -Lmylib.so

Related

When a shared library is loaded, is it possible that it references something in the current binary?

Say I have a binary server, and when it's compiled, it's linked from server.c, static_lib.a, and dynamically with dynamic_lib.so.
When server is executed and it loads dynamic_lib.so dynamically, but on the code path, dynamic_lib.so actually expects some symbols from static_lib.a. What I'm seeing is that, dynamic_lib.so pulls in static_lib.so so essentially I have two static_lib in memory.
Let's assume there's no way we can change dynamic_lib.so, because it's a 3rd-party library.
My question is, is it possible to make dynamic_lib.so or ld itself search the current binary first, or even not search for it in ld's path, just use the binary's symbol, or abort.
I tried to find some related docs about it, but it's not easy for noobs about linkers like me :-)
You can not change library to not load static_lib.so but you can trick it to use static_lib.a instead.
By default ld does not export any symbols from executables but you can change this via -rdynamic. This option is quite crude as it exports all static symbols so for finer-grained control you can use -Wl,--dynamic-list (see example use in Clang sources).

A workaround for the “Template Haskell + C” bug?

I've got the following situation:
Library X is a wrapper over some code in C.
Library A depends on library X.
Library B uses Template Haskell and depends on library A.
GHC bug #9010 makes it impossible to install library B using GHC 7.6. When TH is processed, GHCi fires up and tries to load library X, which fails with a message like
Loading package charsetdetect-ae-1.0 ... linking ... ghc:
~/.cabal/lib/x86_64-linux-ghc-7.6.3/charsetdetect-ae-1.0/
libHScharsetdetect-ae-1.0.a: unknown symbol `_ZTV15nsCharSetProber'
(the actual name of the “unknown symbol” differs from machine to machine).
Are there any workarounds for this problem (apart from “don't use Template Haskell”, of course)? Maybe library X has to be compiled differently, or there's some way to stop it from loading (as it shouldn't be called during code generation anyway)?
This is really one of the main reasons that 7.8 switched to dynamic GHCi by default. Rather than try to support every feature of every object file format, it builds dynamic libraries and lets the system dynamic loader handle them.
Try building with the g++ option -fno-weak. From the g++ man page:
-fno-weak
Do not use weak symbol support, even if it is provided by the linker. By default, G++ will use weak symbols if they are available. This option exists only for testing, and should not be used by end-users; it will result in inferior code and has no benefits. This option may be removed in a future release of G++.
There is another issue with __dso_handle. I found that you can at least get the library to load and apparently work by linking in a file which defines that symbol. I don't know whether this hack will cause anything to go wrong.
So in X.cabal add
if impl(ghc < 7.8)
cc-option: -fno-weak
c-sources: cbits/dso_handle.c
where cbits/dso_handle.c contains
void *__dso_handle;

Linking to a custom .a from multiple objects

In our build system, we generate multiple .so files (foo.so, bar.so, ...) that are loaded during runtime by the main executable (biz). So the .so files are linked separately.
We also have our own util.a static library, that has some utility functions and global data.
The problem comes when some of the .so want to use util.a data/function, but we can't link each .so to util.a. It's because of the data section: global data must be unique in the program address space. If more than one .so is linked to util.a and has a copy of the data, the program behavior will be very surprising but hard to debug.
We can't link executable (biz) to util.a either. The linker will not put everything to the target, since biz doesn't reference the functions on behalf of .so.
Of course, unless linking util.a with -Wl,-whole-archive. But is there a better way to do this?
Solution 1: consider making util.a a dynamic library util.so.
Solution 2: don't let the linker export any symbols exported by util.a. When using gcc you can achieve this for example by using __attribute__((visibility("hidden"))):
int __attribute__((visibility("hidden"))) helperfunc(void *p);
You can use objdump to check which symbols are exported.
To answer myself's question, the eventual solution was like:
http://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00099.html
TL;DR: Search for all the interesting symbols (that you want to pull from archives) inside the .so objects with nm (1), and inject into the compiling command line with -Wl,-u,$SYMBOL. Note that the -Wl,-u,$SYMBOL arguments need to come before archive names in the command line, so the linker knows that it needs to link them.

How to keep a specific symbol in binary file?

I have a static lib (my_static_lib) which I link to an executable binary file. Some of the symbols, but not all, are used in my binary.
A second library, dynamically loaded(my_shared_lib), is expecting to receive some symbols from my_static_lib through symbol injection from the binary. But those symbols are not used by my_binary, so they are stripped off the final bin file.
So, at runtime, my_shared_lib complains that it cannot find __my_stripped_symbols__ and crashes.
Is there a way to force the linker to keep __my_stripped_symbols__? I would prefer something that can be cleanly written in a Makefile.am (autotools)
(-binary file makefile)
-L$(top_builddir)/static_lib -lmy_static_lib --magic-flag-to-keep-stripped-symbol
I do not want to link my_static_lib with my_shared_lib because it will generate strange conflicts in other parts of a rather complex group of executables/shared libraries.
When you link my_static_lib to your application, you want to use the --whole-archive option. It's documented in the ld options docs.
If you're linking with gcc, it looks something like this:
-L$(top_builddir)/static_lib -Wl,-whole-archive -lmy_static_lib -Wl,-no-whole-archive
That will make sure the entire library is kept, and not just the specific functions that your executable uses.
You also need to make sure that the symbols get exported. If the symbols from your static library aren't being exported already, you make need a combination of -fvisibility=hidden and use __attribute__ ((visibility("default"))) to mark up the ones you want exported. You can read a little more about it in the gcc docs

Restricting symbols to local scope for linux executable

Can anyone please suggest some way we can restrict exporting of our symbols to global symbol table?
Thanks in advance
Hi,
Thanks for replying...
Actually I have an executable which is statically linked to a third party library say "ver1.a" and also uses a third party ".so" file which is again linked with same library but different version say "ver2.a". Problem is implementation of both these versions is different. At the beginning, when executable is loaded, symbols from "ver1.a" will get exported to global symbol table. Now whenever ".so" is loaded it will try to refer to symbols from ver2.a, it will end up referring to symbols from "ver1.a" which were previously loaded.Thus crashing our binary.
we thought of a solution that we wont be exporting the symbols for executable to Global symbol table, thus when ".so" gets loaded and will try to use symbols from ver2.a it wont find it in global symbol table and it will use its own symbols i.e symbols from ver2.a
I cant find any way by which i can restrict exporting of symbols to global symbol table. I tried with --version-script and retain-symbol-file, but it didn't work. For -fvisibility=hidden option, its giving an error that " -f option may only be used with -shared". So I guess, this too like "--version-script" works only for shared libraries not for executable binaries.
code is in c++, OS-Linux, gcc version-3.2. It may not be possible to recompile any of the third party libraries and ".so"s. So option of recompiling "so' file with bsymbolic flag is ruled out.
Any help would be appreciated.
Pull in the 3rd party library with dlopen.
You might be able to avoid that by creating your own shared lib that hides all the third party symbols and only exposes your own API to them, but if all else fails dlopen gives you complete control.
I had, what sounds like, a similar issue/question: Segfault on C++ Plugin Library with Duplicate Symbols
If you can rebuild the 3rd party library, you could try adding the linker flag -Bsymbolic (the flag to gcc/g++ would be -Wl,-Bsymbolic). That might solve your issue. It all depends on the organization of your code and stuff, as there are caveats to using it:
http://www.technovelty.org/code/c/bsymbolic.html
http://software.intel.com/en-us/articles/performance-tools-for-software-developers-bsymbolic-can-cause-dangerous-side-effects/
If you can't rebuild it, according to the first caveat link:
In fact, the only thing the -Bsymbolic
flag does when building a shared
library is add a flag in the dynamic
section of the binary called
DT_SYMBOLIC.
So maybe there's a way to add the DT_SYMBOLIC flag to the dynamic section post-linking?
The simplest solution is to rename the symbols (by changing source code) in your executable so they don't conflict with the shared library in the first place.
The next simplest thing is to localize the "problem" symbols with 'objcopy -L problem_symbol'.
Finally, if you don't link directly with the third party library (but dlopen it instead, as bmargulies suggests), and none of your other shared libraries use of define the "problem" symbol, and you don't link with -rdynamic or one of its equivalents, then the symbol should not be exported to the dynamic symbol table of the executable, and thus you shouldn't have a conflict.
Note: 'nm a.out' will still, show the symbol as globally defined, but that doesn't matter for dynamic linking. You want to look at the dynamic symbol table of a.out with 'nm -D a.out'.

Resources