Compile part of all dependencies as shared libraries - Haskell

Say I have (regular source) libraries A and B, and an executable E which depends on both.
Now, I want E to include the object files of A directly, whereas B should be added as a shared library (concrete use case: B contains the shared types of a plugin architecture). How would I do that with existing tools, preferably stack?
Is that possible or is it rather an all-or-nothing choice (use only shared libraries or link everything into the same binary)?
Ideally, I'd like to specify for each dependency whether it should be linked statically or dynamically. That should probably go into the .cabal file, but we have to work with what we've got...
(Well, technically both variants end up linked into the program; it's just that in the second case the object code is split across different files. You get the idea.)
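For concreteness, a minimal sketch of what E's .cabal file would look like (package and module names are placeholders from the question; the comments mark the desired linkage, which plain build-depends cannot express):

```cabal
executable E
  main-is:          Main.hs
  build-depends:    base
                  , A   -- wanted: object files linked in directly
                  , B   -- wanted: linked as a shared library (plugin types)
  default-language: Haskell2010
```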

Related

Why are some foreign functions statically linked while others are dynamically linked?

I'm working on a program that needs to manipulate git repositories. I've decided to use libgit2. Unfortunately, the Haskell bindings for it are several years out of date and lack several functions that I require. Because of this I've decided to write the portions that use libgit2 in C and call them through the FFI. For demonstration purposes one of them is called git_update_repo.
git_update_repo works perfectly when used in a pure C program; however, when it's called from Haskell, an assertion failure indicates that the libgit2 global init function, git_libgit2_init, hasn't been called. But git_libgit2_init is called by git_update_repo, and if I use gdb I can see that git_libgit2_init is indeed called and reports that the initialization was successful.
I've used nm to examine the executables and found something interesting. In a pure C executable, all the libgit2 functions are dynamically linked (as expected). However, in my Haskell executable, git_libgit2_init is dynamically linked, while the rest of the libgit2 functions are statically linked. I'm certain that this mismatch is the cause of my issue.
So why do certain functions get linked dynamically and others statically? How can I change this?
The relevant settings in my .cabal file are:
cc-options: -g
c-sources:
  src/git-bindings.c
extra-libraries:
  git2

Removing entry from DYNAMIC section of elf file

I have a third-party library A that requires some library B.
A is linked into my binary, which is also linked with a static version of B.
Therefore the dynamic version of B is no longer needed.
A is not under my control and I cannot recompile it. Thus I want to remove the NEEDED entry for B from the DYNAMIC section of A.
Is there a way to do it with objcopy or other tool?
Is there a way to do it with objcopy or other tool?
I don't know of any existing tool that can do this, although elfsh might be able to.
It is quite trivial to write a C program to do what you want: the .dynamic section of libA.so is a table of fixed-size records (of type ElfW(Dyn)), terminated by an entry with .d_tag == DT_NULL. To get rid of a particular DT_NEEDED entry, simply "slide" all following entries up (overwriting entry[n] with entry[n+1], etc.). This will leave your .dynamic with two DT_NULL entries at the end, but nothing should ever care.
One complication is that if libB.so contains versioned symbols that libA.so references, then there will be additional references to libB.so in DT_VERNEED table, and these are more difficult to get rid of. If you don't get rid of VERNEED references, the dynamic linker will fail assertions.

Force mapping between symbols and shared libraries

I have an executable and four shared libraries, and the dependency tree looks like this: the executable app dlopens foo.so and bar.so; foo.so in turn links against fooHelper.so, and bar.so links against barHelper.so.
Now, the issue is that fooHelper.so and barHelper.so contain some of the same symbols. For instance, say there is a function func with different implementations in fooHelper.so and in barHelper.so. Is there a way to force foo.so to use fooHelper.so's implementation and bar.so to use barHelper.so's?
What happens at present is that, depending on the order in which the helpers are linked, only one of the implementations of func is used by both foo.so and bar.so. This is the default Unix linkage model: if a definition of a symbol has already been loaded, any other definitions from subsequently loaded shared libraries are simply discarded, so func is picked up from whichever helper library was linked first. I need a way to specify the appropriate mapping explicitly, without changing the source code of the shared libraries.
I'm working on Linux with g++ 4.4.
Is there a way to force foo.so to use fooHelper.so's implementation and bar.so to use barHelper.so's?
Yes: that's what RTLD_LOCAL is for (when dlopening foo.so and bar.so).
RTLD_LOCAL
This is the converse of RTLD_GLOBAL, and the default if neither flag
is specified. Symbols defined in this library are not made available
to resolve references in subsequently loaded libraries.
If both funcs happen to be in the same namespace, you're in a bit of trouble - if you are programming in C. The term to look for is "function overloading". There have been previous discussions on this topic, e.g. this one:
function overloading in C
EDIT: http://litdream.blogspot.de/2007/03/dynamic-loading-using-dlopen-api-in-c.html

Why should I recompile an entire program just for a library update?

With respect to the following link:
http://www.archlinux.org/news/libpnglibtiff-rebuilds-move-from-testing/
Could someone explain to me why a program should be rebuilt after one of its libraries has been updated?
How does that make any sense since the "main" file is not changed at all?
If the signatures of the functions involved haven't changed, then "rebuilding" the program means that the object files must be linked again. You shouldn't need to compile them again.
An API is a contract that describes the interface to the public functions in a library. When the compiler generates code, it needs to know what type of variables to pass to each function, and in what order. It also needs to know the return type, so it knows the size and format of the data that will be returned from the function. When your code is compiled, the address of a library function may be represented as "start of the library, plus 140 bytes." The compiler doesn't know the absolute address, so it simply specifies an offset from the beginning of the library.
But within the library, the contents (that is, the implementations) of the functions may change. When that happens, the length of the code may change, so the addresses of the functions may shift. It's the job of the linker to understand where the entry points of each function reside, and to fill those addresses into the object code to create the executable.
On the other hand, if the data structures in the library have changed and the library requires the callers to manage memory (a bad practice, but unfortunately common), then you will need to recompile the code so it can account for the changes. For example, if your code uses malloc(sizeof(dataStructure)) to allocate memory for a library data structure that's doubled in size, you need to recompile your code because sizeof(dataStructure) will have a larger value.
There are two kinds of compatibility: API and ABI.
API compatibility is about functions and data structures which other programs may rely on. For instance if version 0.1 of libfoo defines an API function called "hello_world()", and version 0.2 removes it, any programs relying on "hello_world()" need updating to work with the new version of libfoo.
ABI compatibility is about the assumptions of how functions and, in particular, data structures are represented in the binaries. If, for example, libfoo 0.1 also defined a data structure "recipe" with two fields, "instructions" and "ingredients", and libfoo 0.2 introduces a "measurements" field before "ingredients", then programs built against libfoo 0.1's recipe must be recompiled, because the "instructions" and "ingredients" fields will likely be at different positions in the 0.2 version of the libfoo.so binary.
What is a "library"?
If a "library" is only a binary (e.g. a dynamically linked library aka ".dll", ".dylib" or ".so"; or a statically linked library aka ".lib" or ".a"), then there is no need to recompile; re-linking should be enough (and even that can be avoided in some special cases).
On the other hand, libraries often consist of more than just the binary object - e.g. the header files might include some inline (or macro) logic.
If so, re-linking is not enough, and you might have to re-compile in order to make use of the newest version of the lib.

Tools that list the prototypes in .so library

Is there a tool (i.e. a command) on Linux that lists the prototypes in a .so library?
I found nm close to my need, but all I got were just symbols.
If the library is a C library, it does not by itself contain the signatures of the functions. These are in the header files (which the library should provide), unless the .so library has been compiled with debugging information enabled by -g (which is not usual for production libraries).
Even in C++, a .so library (without -g) doesn't contain the declarations of the classes involved. The mangled names only refer to class or type names...
In short, you need the header files of the libraries. Most Linux distributions package them separately from the library itself. For instance, on Debian you have both the libjansson4 package (containing the .so shared library, needed to run applications linked with the Jansson library) and the libjansson-dev package (containing the shared objects and header files needed to build an application calling functions in the Jansson library). Debian also provides libjansson-dbg (for the debugging variant of the library) and libjansson-doc (for the documentation) packages.
Simple answer: no you cannot do that (for C).
Longer answer:
You can get "prototypes", as you named them, ONLY for C++, because function declarations there are mangled. Mangling really means encoding the whole function signature (or prototype, if you like) into one string of characters without spaces, e.g.:
CCertificate::GetInfo(Utils::TCertInfo&) const
which is in mangled form:
_ZNK12CCertificate7GetInfoERN5Utils9TCertInfoE
Mangling was introduced because of function overloading in C++ (functions with the same name but taking a different number of parameters and/or different types). In C you do not have overloading, so functions are identified (in shared libraries) by name, which is NOT mangled.
To summarise: all functions in shared libraries are identified by name; for C++ these names are mangled, for C they are not.
Mangling gives you the additional "side effect" that you can see the function signature (e.g. by invoking nm -C).
Hope that helps.
