How to restrict access to symbols in shared object? - linux

I have a plug-in in the form of a shared library (bar.so) that links into a larger program (foo). Both foo and bar.so depend on the same third party library (baz) but they need to keep their implementations of baz completely separate. So when I link foo (using the supplied object files and archives) I need it to ignore any use of baz in bar.so and vice versa.
Right now if I link foo with --trace-symbol=baz_fun where baz_fun is one of the offending symbols I get the following output:
bar.so: definition of baz_fun
foo/src.a(baz.o): reference to baz_fun
I believe this is telling me that foo is referencing baz_fun from bar.so (and execution of foo confirms this).
Solutions that I have tried:
Using objcopy to "localize" the symbols of interest: objcopy --localize-symbols=local.syms bar.so where local.syms contains all of the symbols of interest. I think I might just be confused here and maybe "local" doesn't mean what I think it means. Regardless, I get the same output from the link above. I should note that if I run the nm tool on bar.so prior to using objcopy all of the symbols in question have the T flag (upper-case indicating global) and after objcopy they have a t indicating they are local now. So it appears I am using objcopy correctly.
Compiling with -fvisibility=hidden however due to some other constraints I need to use GCC 3.3 which doesn't appear to support that feature. I might be able to upgrade to a newer version of GCC but would like confirmation that compiling with this flag will help me before heading down that road.
Other things to note:
I do not have access to the source code of either foo or baz
I would prefer to keep all of my plug-in in one shared object (bar.so). baz is actually a licensing library so I don't want it separated

Use dlopen to load your plugin with RTLD_DEEPBIND flag.
(edit)
Please note that RTLD_DEEPBIND is Linux-specific and need glibc 2.3.4 or newer.

Related

How to embed a static library into a shared library?

On linux I am trying to create a shared library, libbar.so, that embeds a commercial static library (licensing is fine). The commercial library has 4 versions: libfoo-seq.a, libfoo-mt.a, libfoo-seq.so, and libfoo-mt.so (they all provide the same symbols, just the code is sequential/multi-threaded, and the lib is static/shared). Of these four I want my code always to use the sequential foo library, so when I create libbar.so I link together my object files and libfoo-seq.a.
The problem is that the users of my library may have already pulled in libfoo-mt.so by the time they pull in my libbar.so, thus all symbols from libfoo are already present by the time libbar.so is read in, so my calls to the functions in foo are resolved to the multithreaded version.
I wonder how can I resolve this issue? What kind of magic flags do I need to use when I compile to create my object files and when I link my object files with libfoo-seq.a to create libbar.so?
You can hide libfoo's symbols in libbar via version script:
$ cat libbar.map
{
global: libbar_*;
local: libfoo_*;
};
$ gcc ... -o libbar.so -Wl,--version-script=libbar.map

How do I properly link to a Rust dynamic library from C?

I have a Cargo project named foo that produces a libfoo-<random-hash>.dylib. How do I link against it? clang only finds it if I rename it to libfoo.dylib but libfoo-<random-hash>.dylib is needed for running the program. My workaround is to manually copy and rename the lib but this can’t be the proper procedure.
My procedure to compile and run the C program is:
$ cargo build
$ cp target/debug/libfoo-*.dylib target/debug/libfoo.dylib
$ clang -lfoo -L target/debug -o bar bar.c
$ LD_LIBRARY_PATH=target/debug ./bar
The hash serves a purpose: It gives a link error in case of a version mismatch (especially important with cross-crate inlining). It also allows one binaries to link to two versions of a crate (e.g., because your application uses two libraries that internally use different versions). But in the C world, nobody cares about that. If you don't care either, or care more about making it easy to link to your library, just rename the shared object to get rid of the hash.
Note that the hash is added by Cargo. If you build by directly invoking rustc, no hash is added — but nobody wants that. AFAIK Cargo doesn't offer an option to omit the hash, but you might want to try asking the developers or filing an issue.
Alternatively, you could extract the full library name (including hash) from the file name, and instruct clang to link to that. Then you get the above advantages, but it makes the build process of everyone who wants to link to your library more complicated.
Since Rust 1.11, Cargo supports the
cdylib
artifact type for this exact purpose:
[lib]
name = "foo"
crate-type = ["rlib", "cdylib"]
This definition yields a ./target/{release,debug}/libfoo.so with
all the Rust dependencies statically linked in so it can be linked into C
projects right away.

A workaround for the “Template Haskell + C” bug?

I've got the following situation:
Library X is a wrapper over some code in C.
Library A depends on library X.
Library B uses Template Haskell and depends on library A.
GHC bug #9010 makes it impossible to install library B using GHC 7.6. When TH is processed, GHCi fires up and tries to load library X, which fails with a message like
Loading package charsetdetect-ae-1.0 ... linking ... ghc:
~/.cabal/lib/x86_64-linux-ghc-7.6.3/charsetdetect-ae-1.0/
libHScharsetdetect-ae-1.0.a: unknown symbol `_ZTV15nsCharSetProber'
(the actual name of the “unknown symbol” differs from machine to machine).
Are there any workarounds for this problem (apart from “don't use Template Haskell”, of course)? Maybe library X has to be compiled differently, or there's some way to stop it from loading (as it shouldn't be called during code generation anyway)?
This is really one of the main reasons that 7.8 switched to dynamic GHCi by default. Rather than try to support every feature of every object file format, it builds dynamic libraries and lets the system dynamic loader handle them.
Try building with the g++ option -fno-weak. From the g++ man page:
-fno-weak
Do not use weak symbol support, even if it is provided by the linker. By default, G++ will use weak symbols if they are available. This option exists only for testing, and should not be used by end-users; it will result in inferior code and has no benefits. This option may be removed in a future release of G++.
There is another issue with __dso_handle. I found that you can at least get the library to load and apparently work by linking in a file which defines that symbol. I don't know whether this hack will cause anything to go wrong.
So in X.cabal add
if impl(ghc < 7.8)
cc-option: -fno-weak
c-sources: cbits/dso_handle.c
where cbits/dso_handle.c contains
void *__dso_handle;

Is there an automated way to figure out Shared Object Dependencies?

Short:
I'm looking for something that will list all unresolved dependencies in an SO, taking into account the SOs that are in it's dependencies.
Long:
I'm converting a lot of static-compiled code to Shared Objects in Linux- is there an easy way to determine what other SOs my recently compiled SO is dependent on besides trial & error while trying to load it?
I'm sure there is a better way, but I haven't been able to find it yet.
I've found "ldd", but that only lists what the SO says it's dependent on.
I've also used "nm" to figure out once an SO fails to load to verify what other SO contains it.
I don't have code for you, but I can give pointers:
It's just a graph problem. You should use objdump -T to dump the dynamic symbol table for a given binary or shared object. You'll see many lines of output, and the flags can be a little confusing, but the important part if that symbols will either be *UND* or they'll have a segment name (.text etc).
Any time you see *UND*, that means that it's an undefined symbol which has to be resolved. Defined symbols are the targets of resolution.
With that, and a little Python, you should be able to find what you need.
"ldd -r foo.so" should print the set of symbols which foo.so needs but which aren't provided by its direct dependencies.
Alternatively, link foo.so like this:
gcc -shared -o foo.so foo.o bar.o -ldep1 -ldep2 -Wl,--no-undefined
This should fail (to link) if foo.o or bar.o uses something not provided by libdep1 or libdep2 or libc.

Restricting symbols to local scope for linux executable

Can anyone please suggest some way we can restrict exporting of our symbols to global symbol table?
Thanks in advance
Hi,
Thanks for replying...
Actually I have an executable which is statically linked to a third party library say "ver1.a" and also uses a third party ".so" file which is again linked with same library but different version say "ver2.a". Problem is implementation of both these versions is different. At the beginning, when executable is loaded, symbols from "ver1.a" will get exported to global symbol table. Now whenever ".so" is loaded it will try to refer to symbols from ver2.a, it will end up referring to symbols from "ver1.a" which were previously loaded.Thus crashing our binary.
we thought of a solution that we wont be exporting the symbols for executable to Global symbol table, thus when ".so" gets loaded and will try to use symbols from ver2.a it wont find it in global symbol table and it will use its own symbols i.e symbols from ver2.a
I cant find any way by which i can restrict exporting of symbols to global symbol table. I tried with --version-script and retain-symbol-file, but it didn't work. For -fvisibility=hidden option, its giving an error that " -f option may only be used with -shared". So I guess, this too like "--version-script" works only for shared libraries not for executable binaries.
code is in c++, OS-Linux, gcc version-3.2. It may not be possible to recompile any of the third party libraries and ".so"s. So option of recompiling "so' file with bsymbolic flag is ruled out.
Any help would be appreciated.
Pull in the 3rd party library with dlopen.
You might be able to avoid that by creating your own shared lib that hides all the third party symbols and only exposes your own API to them, but if all else fails dlopen gives you complete control.
I had, what sounds like, a similar issue/question: Segfault on C++ Plugin Library with Duplicate Symbols
If you can rebuild the 3rd party library, you could try adding the linker flag -Bsymbolic (the flag to gcc/g++ would be -Wl,-Bsymbolic). That might solve your issue. It all depends on the organization of your code and stuff, as there are caveats to using it:
http://www.technovelty.org/code/c/bsymbolic.html
http://software.intel.com/en-us/articles/performance-tools-for-software-developers-bsymbolic-can-cause-dangerous-side-effects/
If you can't rebuild it, according to the first caveat link:
In fact, the only thing the -Bsymbolic
flag does when building a shared
library is add a flag in the dynamic
section of the binary called
DT_SYMBOLIC.
So maybe there's a way to add the DT_SYMBOLIC flag to the dynamic section post-linking?
The simplest solution is to rename the symbols (by changing source code) in your executable so they don't conflict with the shared library in the first place.
The next simplest thing is to localize the "problem" symbols with 'objcopy -L problem_symbol'.
Finally, if you don't link directly with the third party library (but dlopen it instead, as bmargulies suggests), and none of your other shared libraries use of define the "problem" symbol, and you don't link with -rdynamic or one of its equivalents, then the symbol should not be exported to the dynamic symbol table of the executable, and thus you shouldn't have a conflict.
Note: 'nm a.out' will still, show the symbol as globally defined, but that doesn't matter for dynamic linking. You want to look at the dynamic symbol table of a.out with 'nm -D a.out'.

Resources