LD_PRELOAD to override functions from dynamically loaded library - linux

I'm using a wrapper library to trace functions using LD_PRELOAD which does work when the functions I'm tracing are referenced in the application.
The wrapper library uses dlsym to populate symbols it wraps.
But this is not working if the application doesn't reference functions directly but through dlopen.
Should the wrapper library work with dynamically loaded libraries? If not, is there a way to make it work?

But this is not working if the application doesn't reference functions directly but through dlopen.
If the application performs:
void *h = dlopen("libfoo.so", ...);
void *sym = dlsym(h, "symbol");
then the symbol will be resolved to libfoo.so, regardless of any LD_PRELOADs (and indeed regardless of any other instances of symbol in other already loaded libraries). This is working as intended.
Should the wrapper library work with dynamically loaded libraries?
No.
If not, is there a way to make it work?
Yes, you can make it work. You would need to provide a replacement.so which provides all the symbols which the application looks up in libfoo.so, and then make it so the replacement.so is dlopend()ed by the application.
One way to do that is to rename libfoo.so -> libfoo.so.orig, copy replacement.so -> libfoo.so, and have replacement.so itself dlopen("libfoo.so.orig", ...).
If the application dlopens libfoo.so without absolute path, you may be able to arrange so replacement.so is earlier in the search path, e.g.
mkdir /tmp/replacement
ln -s /path/to/replacement.so /tmp/replacement/liboo.so
LD_LIBRARY_PATH=/tmp/replacement /path/to/a.out
If you go that route, replacement.so would still need to know how to dlopen the original libfoo.so (could use hard-coded absolute path for that).

Related

Linux library calls ambiguously named function in executable - is this possible?

I have a problem with an embedded linux C++ application I've written that consists of an executable and a dynamically linked library. The executable calls a function that is one of the entry points in the library, but that function misbehaves. I've investigated using gdb, and find that the library function, which is supposed to make a call to another function xyz() within the library, actually calls a function of the same name xyz()within the executable.
I'm very surprised this can happen, so maybe I'm doing something stupid. Isn't the library linked within itself without reference to the executable? If the executable wrongly made a call to abc() in the library instead of abc() in the executable that would make slightly more sense, because it is at least linked with the library, although in that case would the linker spot the dual definition? Or prioritise the local function?
I could just rename my functions so none of them have matching names, but I'd like to understand what is going on. I don't have much experience in this area, or with the gcc tools. Firstly, is what I think is happening in the above scenario even possible?
Both the executable and the library make calls to another library.
The link command for the library I'm using is:
powerpc-unknown-linux-gnuspe-g++-4.9.3 aaa.o bbb.o [etc] -shared -o libmylibary.so -L ../otherlibpath -Wl,-rpath-link,../otherlibpath -lotherlibname
That is way how the dynamic linker works. The symbols in executable have higher priority then symbols in dynamic libraries. Dynamic library designer must be aware about it. She must do measures to avoid unwanted symbol mismatch. Most libraries use:
In case of C++ use namespaces. All symbols exported from library should be in a library namespace.
In case of C use a name prefix or suffix for all exported symbol. For example OpenSSL library uses the prefix SSL_ and the public functions have names like SSL_set_mode() so the unwanted symbol collision is avoided.
Do not export symbols from the library that are supposed to be private. If the symbol is not exported from the library then the dynamic linker use the local symbol in the library. #pragma visibility is your friend. See https://gcc.gnu.org/wiki/Visibility
If the library with duplicate symbols is a 3rd party library and its author does not follow the recommendations above then you have to rename your function or perhaps ask the author for a library update.
EDIT
Export/do not export may be controlled by #pragma visibility directive (gcc specific extension):
void exported_function1(int);
void exported_function2(int);
#pragma GCC visibility push(hidden)
void private_function1(int);
void private_function2(int);
#pragma GCC visibility pop
Detail at the link above.

When a shared library is loaded, is it possible that it references something in the current binary?

Say I have a binary server, and when it's compiled, it's linked from server.c, static_lib.a, and dynamically with dynamic_lib.so.
When server is executed and it loads dynamic_lib.so dynamically, but on the code path, dynamic_lib.so actually expects some symbols from static_lib.a. What I'm seeing is that, dynamic_lib.so pulls in static_lib.so so essentially I have two static_lib in memory.
Let's assume there's no way we can change dynamic_lib.so, because it's a 3rd-party library.
My question is, is it possible to make dynamic_lib.so or ld itself search the current binary first, or even not search for it in ld's path, just use the binary's symbol, or abort.
I tried to find some related docs about it, but it's not easy for noobs about linkers like me :-)
You can not change library to not load static_lib.so but you can trick it to use static_lib.a instead.
By default ld does not export any symbols from executables but you can change this via -rdynamic. This option is quite crude as it exports all static symbols so for finer-grained control you can use -Wl,--dynamic-list (see example use in Clang sources).

Linking to a custom .a from multiple objects

In our build system, we generate multiple .so files (foo.so, bar.so, ...) that are loaded during runtime by the main executable (biz). So the .so files are linked separately.
We also have our own util.a static library, that has some utility functions and global data.
The problem comes when some of the .so want to use util.a data/function, but we can't link each .so to util.a. It's because of the data section: global data must be unique in the program address space. If more than one .so is linked to util.a and has a copy of the data, the program behavior will be very surprising but hard to debug.
We can't link executable (biz) to util.a either. The linker will not put everything to the target, since biz doesn't reference the functions on behalf of .so.
Of course, unless linking util.a with -Wl,-whole-archive. But is there a better way to do this?
Solution 1: consider making util.a a dynamic library util.so.
Solution 2: don't let the linker export any symbols exported by util.a. When using gcc you can achieve this for example by using __attribute__((visibility("hidden"))):
int __attribute__((visibility("hidden"))) helperfunc(void *p);
You can use objdump to check which symbols are exported.
To answer myself's question, the eventual solution was like:
http://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00099.html
TL;DR: Search for all the interesting symbols (that you want to pull from archives) inside the .so objects with nm (1), and inject into the compiling command line with -Wl,-u,$SYMBOL. Note that the -Wl,-u,$SYMBOL arguments need to come before archive names in the command line, so the linker knows that it needs to link them.

Linux library code injection & calls to identically named functions in SOs

I have built a linux shared object which I inject into a 3rd party program to intercept some dynamic function calls using LD_PRELOAD.
The 3rd party program uses a SO "libabc.so" located at some path. My injected SO uses another SO, also called "libabc.so" located at another path (essentially identical but slight code differences).
My problem is now, that calls to a function "def" which appear in both libabc.so are always resolved by the first. (Presumably because it is loaded first?!) How can I get them to be resolved with the second libabc.so?
Many thanks!
Unless something changed since I used to do this, you will need to dlopen() the library you want to pass calls on to and call the function manually, something like;
handle = dlopen("/path/to/libabc.so", RTLD_LAZY);
otherDef = dlsym(handle, "def");
orderDef(parameter);
There is a complete example how to do this very thing at LinuxJournal.
If you only want to use one libabc.so version, you can always use LD_PRELOAD to load it along with your own shared object before anything else.
If you want to use multiple versions, you have a few alternatives:
Use dlopen() in your shared object to load that library. Since you have created a function injection object you should be familiar with this procedure. This is the more generic and powerful way - you could even mix & match functions from different library versions.
Use a different DT_SONAME for the library version your shared object links against. Unfortunately this requires (slightly) changing the build system of that library and recompiling.
Link your shared object statically against the library in question. Not always possible but it does not require modifying the library in question. The main issue with this alternative is that any change in the library should be followed by a relinking of your shared object for the changes to be pulled in.
Warning: you may need to use a custom linker script or specific linker options to avoid symbol conflicts.

How to check Shared Library exposed functions in program

I am using a third party shared library and I need to check whether a function is exported by shared library programatically.
How to do this. I need this because if function does not exist I need to run some other function locally.
You could probably use dlsym for this.
If you load the library with dlopen, you will use the handle that it returns.
If you're linked against this library you may use special pseudo-handles (10x to caf for pointing it out):
From dlsym man:
There are two special pseudo-handles, RTLD_DEFAULT and RTLD_NEXT. The former will find the first occurrence of the desired symbol using the default library search order. The latter will find the next occurrence of a function in the search order after the current library. This allows one to provide a wrapper around a function in another shared library.
Check the header file of the intended library to get the function signature.
Using dlopen you can load the library dynamically and fetch the symbol if it is exposed in the library with subsequent calls to dlsym and dlclose.
may be you can use objdump command to check all symbol exposed like this
objdump -T libtest.so

Resources