How to generate a library with a specific name via cabal (Haskell)

I am trying to build a shared Haskell library that is to be used by a C project afterwards. I am on a Linux platform, so my question is asked in that context.
Suppose I have a Haskell package foo, say at version 0.1, with a library named foo that exports some functions via the FFI.
I can easily generate a shared library (.so) that I can then link with, but my issue is that the generated library is named libHSfoo-0.1-$COMPONENT_ID.so, which makes it quite cumbersome to link with, since $COMPONENT_ID is unpredictable as far as I can tell.
To the best of my knowledge, the $COMPONENT_ID comes from Cabal's internal component identifier, and it looks like I could write Cabal hooks to at least copy the generated shared library, or create a symbolic link to it from a fixed location.
I am wondering whether there is a better way to specify the component-id and get an easily predictable shared-library name without post-processing.
It seems I can achieve this if, in the configure hook, I set configArgs to just the library component and configCID to my desired library name, but that feels fragile, and I suspect there is a better way.
The name of the library also affects linking when other Haskell packages depend on this one, which makes being able to specify or override the name even more desirable.
I am using stack to drive cabal, if that is relevant.
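For what it's worth, the copy/symlink workaround mentioned above can be automated in a custom Setup.hs. A minimal sketch, assuming build-type: Custom, a package foo at version 0.1, and the default dist/build layout; the glob pattern and the fixed name libfoo.so are assumptions, not a definitive implementation:
-- Setup.hs: after a build, point a stable name at the unpredictably
-- named shared library. Sketch only; pattern and target are assumptions.
import Data.List (isPrefixOf, isSuffixOf)
import Distribution.Simple
import Distribution.Simple.LocalBuildInfo (buildDir)
import System.Directory (listDirectory, removePathForcibly)
import System.FilePath ((</>))
import System.Posix.Files (createSymbolicLink)

main :: IO ()
main = defaultMainWithHooks simpleUserHooks
  { postBuild = \_ _ _ lbi -> do
      let dir = buildDir lbi
      libs <- filter isFooLib <$> listDirectory dir
      case libs of
        (so:_) -> do
          removePathForcibly (dir </> "libfoo.so")   -- drop any stale link
          createSymbolicLink so (dir </> "libfoo.so")
        []     -> return ()
  }
  where
    isFooLib f = "libHSfoo-0.1-" `isPrefixOf` f && ".so" `isSuffixOf` f
Since stack drives Setup.hs as well, the same hook runs under stack builds.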

Related

How to Use Haskell's Stack Build Tool to Export a Library to Be Consumed by C/C++?

Suppose one is using the stack build tool to make a Haskell library (importing packages from Hackage, and so forth) to be used with a C/C++ project in which main is located in C/C++.
Supposing your project is named Lib.hs (which uses external libraries from Hackage), is there a way to use stack to export your Lib.o, Lib.hi, and Lib_stub.h to be consumed by a C/C++ compiler like gcc or g++?
EDIT: A related question might be: "How can one use Stack as a build tool for a Haskell & C/C++ project in which main is located in C/C++?"
EDIT2: Upon reflection, one way to solve this problem would be to use Stack as usual, but migrate your C/C++ main function to Haskell. Is this the best way to do it? Are there huge performance costs to this or anything I should be aware of?
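For concreteness, the kind of module being exported looks something like the following sketch; compiling it is what produces the Lib.o, Lib.hi, and Lib_stub.h files mentioned above (the exported function is illustrative):
{-# LANGUAGE ForeignFunctionInterface #-}
-- Lib.hs: exposes a Haskell function to C. GHC emits Lib_stub.h with a
-- C prototype for each exported function.
module Lib where

import Foreign.C.Types (CInt (..))

foreign export ccall triple :: CInt -> CInt

triple :: CInt -> CInt
triple x = 3 * x
Note that the C/C++ main must still call hs_init before using any exported function and hs_exit when done (both declared in HsFFI.h).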
Stack can't really do this on its own.
There's support for generating so-called "foreign libraries" added to Cabal, but it's not in a released version yet; see commit 382143. This will produce a shared library that dynamically links against the dynamic versions of each Haskell package used.
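That support later shipped with Cabal 2.0. For reference, a stanza along these lines declares such a library; the names, modules, and version info below are placeholders:
foreign-library foo
  type:             native-shared
  lib-version-info: 0:1:0
  other-modules:    Lib
  build-depends:    base
  hs-source-dirs:   src
On Linux this produces libfoo.so (with version suffixes derived from lib-version-info) rather than a libHSfoo-0.1-$COMPONENT_ID.so style name, which also addresses the naming question at the top of this page.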
You can build your package with stack and then after the fact you can assemble a single native library. In the Galua project we do this with a custom Setup.hs and a separate linking script.
The result of this linking process is that you get a standalone statically linked library suitable for inclusion in a C project: libgalua.a.
Do note that to create standalone libraries on Linux suitable for being linked into a shared library, you'll need to recompile GHC to generate PIC static libraries (macOS does this by default).

How to detect missing symbols in shared library with libtool

As stated, I want to be able to check that a shared library, created by libtool, is not missing any symbols.
I have written a library that is built as a shared library, 'A'. It depends in turn on another library 'B'.
The other library 'B' does not follow strict semver, and so sometimes introduces new functions in minor or patch releases.
Although I try to put appropriate #if B_LIB_VERSION >= 42 guards in my library's code so that it does not attempt to call a function in library B that is not going to be available, apparently I sometimes get the version wrong. This causes an error when the program is run.
Is it possible with libtool, or any other tool, to ask it to produce a list of all the symbols that are not found in a shared library, or any of the libraries that it will load?
As stated, I want to be able to check that a shared library, created by libtool, is not missing any symbols.
That's hard to do with shared libraries, as they are designed to allow for late symbol resolution. If you're not using dlopen-style features, you might be able to build static executables from static versions of A and B and look for missing symbols.
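For example, something along these lines (library and object names assumed) forces every reference to be resolved at link time, so anything missing surfaces as an undefined-reference error:
gcc -static -o selftest main.c libA.a libB.a
This assumes static builds of A, B, and their transitive dependencies are available, which is not always the case on modern distributions.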
The other library 'B' does not follow strict semver, and so sometimes introduces new functions in minor or patch releases.
I'd seriously consider searching for a replacement library rather than having to keep dealing with its versioning issues.
Is it possible with libtool, or any other tool, to ask it to produce a list of all the symbols that are not found in a shared library, or any of the libraries that it will load?
No, not really. nm will give you a list of symbols that are undefined (and referenced) in a shared library. objdump might be of some use also. On Linux, ldd might do some of what you want. But generally there is no way of knowing exactly what a shared library loads, even without considering dlopen.
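For instance, this lists the dynamic symbols that libA.so itself expects some other object to provide at load time:
nm -D --undefined-only libA.so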
libltdl might be of some use as well if you have to stick with the misbehaving library; at least you can figure out at runtime whether libB.42 has symbol xyz or not. It's not as easy as the conditional-compilation approach, though.

How do I statically compile a C library into a Haskell module that I can later load with the GHC API?

Here is my desired use case:
I have a package with a single module that reads HDF5 files and writes some of their data to Haskell records. To do the work, the library uses the bindings-hdf5 package, and reader-types is a package I wrote that defines the types of the Haskell records that contain the read-in data. Here is my cabal file's build-depends:
build-depends: base >=4.7 && <4.8
, text
, vector
, containers
, bindings-hdf5
, reader-types
Note that my cabal file does not currently use extra-libraries or ghc-options. I can load my module, src/Mabel.hs, in ghci as long as I specify the required hdf5_hl library:
ghci src/Mabel.hs -lhdf5_hl -L/long/nixos/path/lib
and within ghci, I can run my function perfectly fine.
Now, what I want to do is compile this library/module into a single, compiled file that I can later load with the GHC API in a different Haskell program. By single file, I mean that it needs to run even if the hdf5_hl library does not exist on the system. Preferably, it would also run even if text, vector, and/or containers are missing, but this is not essential because reader-types depends on those packages anyway. When loading the module with the GHC API, I want it to load in already-compiled form, not run interpreted.
My purpose for doing this is that I want the self-contained file to act as a single, pre-compiled plugin file that is later loaded and executed by a different Haskell executable. Other plugins might not use hdf5 at all, and the only package they are guaranteed to use is reader-types, which essentially defines the plugin interface types.
The hdf5 library on my system contains the following files: libhdf5_hl.la, libhdf5_hl.so, libhdf5.la, libhdf5.so, and similar files that have the version number in the file name.
I have done a lot of googling, but am getting confused by all the edge cases I am finding. Here are some examples that either I am sure don't fit my case or I can't tell whether they do.
I do not want to compile a Haskell library for use from C or Python, only from a Haskell program using the GHC API.
I do not want to compile C wrappers for a C++ library into a Haskell module, because the bindings already exist and the library is already a C library.
I do not want to compile a library that is entirely self-contained because, since I am loading it with the GHC API, I don't need the GHC runtime included in the library. (My understanding is that plugins must be compiled with the same GHC version that will load them via the GHC API.)
I do not want to compile C bindings and the C library at the same time, because the C library is already compiled and the bindings are specified in a separate package (bindings-hdf5).
The closest resource for what I want to do is this exchange on the mailing list from 2009. However, I added extra-libraries: hdf5_hl or extra-libraries: hdf5 to my cabal file, and in both cases the resulting .a, .so, .dyn_hi, .dyn_o, .hi, and .o files in dist/build are all exactly the same size as without extra-libraries, so I'm confident it is not working correctly.
What changes to my cabal file do I need to make to create a self-contained, standalone file that I can later load with the GHC API? If this is not possible, what are the alternatives?
Instead of using the GHC API, I am also open to using the plugins library to load the plugin, but the self-contained requirements are still the same.
EDIT: I do not care what form the compiled "plugin" takes (I assume an object file is the right way), but I want to load it dynamically from a separate executable at run time and execute functions it defines with known names and known types. The reason I want a single file is that there will eventually be other, different plugins, and I want them all to behave the same way without having to worry about lib paths and dependencies for each one. A compiled, single file is a simpler interface for this than zipping/unzipping archives that include Haskell object code and its dependencies.
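For the loading side, the usual GHC API pattern looks roughly like the sketch below, written against a GHC-7.8-era API to match the base bound above; the module name Mabel, the binding name process, and its type are illustrative assumptions, and the API has shifted across GHC releases:
-- Loader.hs: load a compiled module and fetch a binding from it by name.
-- If an up-to-date Mabel.o/.hi pair is found, GHC links the compiled
-- code in memory instead of interpreting the source.
import GHC
import GHC.Paths (libdir)        -- from the ghc-paths package
import Unsafe.Coerce (unsafeCoerce)

loadPlugin :: IO (FilePath -> IO ())   -- the assumed plugin entry-point type
loadPlugin = runGhc (Just libdir) $ do
  dflags <- getSessionDynFlags
  _ <- setSessionDynFlags dflags
  target <- guessTarget "Mabel" Nothing
  setTargets [target]
  _ <- load LoadAllTargets
  setContext [IIDecl (simpleImportDecl (mkModuleName "Mabel"))]
  hv <- compileExpr "Mabel.process"
  return (unsafeCoerce hv)
The unsafeCoerce is the standard trick here: it is only sound because the loader and the plugin agree on the entry point's type via the shared reader-types package.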

Loading Linux libraries at runtime

I think a major design flaw in Linux is the shared object hell when it comes to distributing programs in binary instead of source code form.
Here is my specific problem: I want to publish a Linux program in ELF binary form that should run on as many distributions as possible so my mandatory dependencies are as low as it gets: The only libraries required under any circumstances are libpthread, libX11, librt and libm (and glibc of course). I'm linking dynamically against these libraries when I build my program using gcc.
Optionally, however, my program should also support ALSA (sound interface), the Xcursor, Xfixes, and Xxf86vm extensions, as well as GTK. But these should only be used if they are available on the user's system; otherwise my program should still run, with limited functionality. For example, if GTK isn't there, my program will fall back to terminal mode. Because my program should still be able to run without ALSA, Xcursor, Xfixes, etc., I cannot link dynamically against these libraries, because then the program won't start at all if one of them isn't there.
So I need to manually check if the libraries are present and then open them one by one using dlopen() and import the necessary function symbols using dlsym(). This, however, leads to all kinds of problems:
1) Library naming conventions:
Shared objects often aren't simply called "libXcursor.so" but have some kind of version extension like "libXcursor.so.1" or even really funny things like "libXcursor.so.0.2000". These extensions seem to differ from system to system. So which one should I choose when calling dlopen()? Using a hardcoded name seems like a very bad idea, because the names differ from system to system. So the only workaround that comes to mind is to scan the whole library path, look for filenames starting with the "libXcursor.so" prefix, and then do some custom version matching. But how would I know that the matches are really compatible?
2) Library search paths: Where should I look for the *.so files in the first place? This also differs from system to system. There are some default paths like /usr/lib and /lib, but *.so files could live in lots of other places. So I'd have to open /etc/ld.so.conf and parse it to find all library search paths. That's not trivial, because /etc/ld.so.conf files can use include directives, which means I have to parse even more .conf files, check for infinite loops caused by circular includes, and so on. Is there really no easier way to find the search paths for *.so?
So, my actual question is this: Isn't there a more convenient, less hackish way of achieving what I want to do? Is it really so complicated to create a Linux program that has some optional dependencies like ALSA, GTK, libXcursor... but also works without them? Is there some kind of standard for doing what I want to do? Or am I doomed to do it the hackish way?
Thanks for your comments/solutions!
I think a major design flaw in Linux is the shared object hell when it comes to distributing programs in binary instead of source code form.
This isn't a design flaw as far as creators of the system are concerned; it's an advantage -- it encourages you to distribute programs in source form. Oh, you wanted to sell your software? Sorry, that's not the use case Linux is optimized for.
Library naming conventions: Shared objects often aren't simply called "libXcursor.so" but have some kind of version extension like "libXcursor.so.1" or even really funny things like "libXcursor.so.0.2000".
Yes, this is called external library versioning. Read about it here. As should be clear from that description, if you compiled your binaries using headers on a system that would normally give you libXcursor.so.1 as a runtime reference, then the only shared library you are compatible with is libXcursor.so.1, and trying to dlopen libXcursor.so.0.2000 will lead to unpredictable crashes.
Any system that provides libXcursor.so but not libXcursor.so.1 is either a broken installation, or is also incompatible with your binaries.
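Concretely, the link editor resolves -lXcursor through the unversioned development symlink, but the dependency recorded in the binary is the soname, which you can inspect afterwards (file names assumed):
gcc demo.c -lXcursor -o demo
readelf -d demo | grep NEEDED
The NEEDED entry will read libXcursor.so.1, and that exact string is the one to hand to dlopen.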
Library search paths: Where should I look for the *.so files after all?
You shouldn't be trying to dlopen any of these libraries using their full path. Just call dlopen("libXcursor.so.1", RTLD_LAZY | RTLD_GLOBAL); and the runtime loader will search for the library in system-appropriate locations. (The mode passed to dlopen must include one of RTLD_LAZY or RTLD_NOW; glibc rejects a mode that has neither.)

Finding the shared library name to use with dlopen

In my open-source project Artha I use libnotify for showing passive desktop notifications to the user.
Instead of linking against libnotify at build time, a lookup is made at runtime for the shared object (.so) file via dlopen; if it is available on the target machine, Artha exposes the notification feature in its GUI. On app start, a call to dlopen with the filename parameter libnotify.so.1 is made, and if it returns a non-null handle, the feature is exposed.
A recurring problem with this model is that every time the version number of the library is bumped, Artha's code needs to be updated; currently libnotify.so.4 is the latest to entail such an occurrence.
Is there a Linux system call (irrespective of the distro the app is running on) which can tell me whether a particular library's shared object is available at runtime? I know there is the brute-force option of enumerating the version number from 1 to, say, 10, but I find that solution ugly and inelegant.
Also, if this can be addressed via autoconf, that solution is welcome too, i.e. at build time, based on the target machine, the generated config.h should have the right .so name that can be passed to dlopen.
P.S.: I think good distros follow the style of creating links to libnotify.so.x so that a programmer can just do dlopen("libnotify.so", RTLD_LAZY) and the right version-numbered .so is loaded; unfortunately not all distros follow this, including Ubuntu.
The answer is: you don't.
dlopen() is not designed to deal with things like that, and trying to load whichever soversion you find on the system just because it happens to have the symbols you need is not a good way to do it.
Different sonames have different ABIs, and different ABIs mean that you may be calling the exact same symbol name while it expects a different set (or different sizes) of parameters, which will cause crashes or misbehaviour that are extremely difficult to debug.
You should have a read on how shared object versions work and what an ABI is.
The libfoo.so link is there for the link editor (ld) and is usually installed with the -devel packages for that reason; it might also very well not be a link but rather a text file containing a linker script, oftentimes on purpose to avoid exactly what you're trying to do.
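Tying this back to the Haskell setting of the earlier questions: the safe version of Artha's pattern, probing only the one soname whose ABI you were built against and degrading gracefully if it is absent, can be sketched with the unix package's dlopen binding (the soname here is illustrative):
-- Probe for one known-compatible soname; Nothing means run degraded.
import Control.Exception (IOException, try)
import System.Posix.DynamicLinker (DL, RTLDFlags (RTLD_LAZY), dlopen)

tryLoad :: FilePath -> IO (Maybe DL)
tryLoad soname = do
  r <- try (dlopen soname [RTLD_LAZY]) :: IO (Either IOException DL)
  return (either (const Nothing) Just r)

main :: IO ()
main = do
  mnotify <- tryLoad "libnotify.so.4"   -- the soname this build targets
  putStrLn (maybe "running without notifications"
                  (const "notifications enabled") mnotify)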