How to Use Haskell's Stack Build Tool to Export a Library to Be Consumed by C/C++?

Suppose one is using the Stack build tool to make a Haskell library (importing packages from Hackage, and so forth) to be used with a C/C++ project in which main is located in C/C++.
Supposing your library is named Lib.hs (and uses external libraries from Hackage), is there a way to use Stack to export your Lib.o, Lib.hi, and Lib_stub.h to be consumed by a C/C++ compiler like gcc or g++?
EDIT: A related question might be: "How can one use Stack as a build tool for a Haskell & C/C++ project in which main is located in C/C++?"
EDIT2: Upon reflection, one way to solve this problem would be to use Stack as usual, but migrate your C/C++ main function to Haskell. Is this the best way to do it? Are there huge performance costs to this or anything I should be aware of?

Stack can't really do this on its own.
There's support for generating so-called "foreign libraries" added to Cabal, but it's not in a released version yet. See commit 382143. This will produce a shared library that dynamically links against the dynamic versions of each Haskell package used.
You can build your package with stack and then after the fact you can assemble a single native library. In the Galua project we do this with a custom Setup.hs and a separate linking script.
The result of this linking process is that you get a standalone statically linked library suitable for inclusion in a C project: libgalua.a.
Do note that to create standalone libraries on Linux that are suitable for being linked into a shared library, you'll need to recompile GHC to generate PIC static libraries (macOS does this by default).

Related

Avoid dynamic linking in dependencies

I am developing a project against a custom Linux and I am having trouble with dynamically linked libraries that are referenced by dependencies.
Is there a way to know beforehand whether a dependency has dynamically linked libraries? Is it possible to somehow avoid those libraries? I want to have a static binary (musl didn't work for me, as one dependency doesn't compile with it).
Thanks
If you're compiling against glibc, you'll need to have at least some dynamic linking. While it is possible to statically link glibc, that isn't a supported configuration since the name service switch won't work in such a case.
In general, you should expect a build-dependency on cc or pkg-config to be an indicator of the use of a C or C++ library. That isn't a guarantee either way, but it is probably going to be the case the vast majority of the time. Some of those libraries will be able to be linked statically, but of course if you do that you must recompile your code every time any of your dependencies has a security update or you'll have a vulnerability. There's unfortunately no clear way to tell whether static linking is an option in such a case other than looking at the build.rs or the documentation of the crate.

How to generate library with a specific name via cabal

I am trying to build a shared Haskell library that is used by a C project afterwards. I am on a Linux platform, so my question is from that context.
Suppose I have a Haskell package foo with a library named foo, say version 0.1, which exports some functions via the FFI.
I can easily generate a shared library (.so) that I can then link with, but my issue is that the generated library is named libHSfoo-0.1-$COMPONENT_ID.so which makes it quite cumbersome to link with since $COMPONENT_ID is unpredictable as far as I can tell.
The $COMPONENT_ID comes, to the best of my knowledge, from the following Cabal structure, and it looks like I could write Cabal hooks to at least copy the generated shared library, or create a symbolic link to it from a fixed location.
I am wondering whether there is a better way to specify the component-id to get an easily predictable name of the shared library without post-processing?
It seems like I can achieve this if in the configure hook I set the configArgs to just the library component, and the configCID to my desired name of the library, but that seems like a fragile solution and I am thinking there is a better way for this.
The name of the library also affects linking when there are other Haskell packages dependent on this one, which would make it even more convenient to specify/override the name.
I am using stack to drive cabal, if that is relevant.
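One possible form of the post-processing workaround mentioned above (a symbolic link from a fixed location) is sketched below in C. This is only an illustration, not a Cabal or Stack feature: the function name link_versioned_lib, the prefix libHSfoo-0.1- and the link name libfoo.so are assumptions made for the example, and the directory is assumed to be given as an absolute path so that the link target resolves from anywhere.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Looks in `dir` for the first entry named <prefix>...so and makes
 * `link_path` a symbolic link to it, replacing any existing link.
 * Returns 0 on success, -1 if nothing matched or an error occurred.
 * Hypothetical helper: `dir` is assumed to be an absolute path. */
int link_versioned_lib(const char *dir, const char *prefix,
                       const char *link_path)
{
    assert(dir != NULL && prefix != NULL && link_path != NULL);

    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int rc = -1;
    size_t plen = strlen(prefix);
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        size_t n = strlen(e->d_name);
        /* Match names that start with the prefix and end in ".so". */
        if (n >= plen + 3 &&
            strncmp(e->d_name, prefix, plen) == 0 &&
            strcmp(e->d_name + n - 3, ".so") == 0) {
            char target[4096];
            int len = snprintf(target, sizeof target, "%s/%s",
                               dir, e->d_name);
            if (len < 0 || (size_t)len >= sizeof target)
                break;              /* path too long: give up */
            unlink(link_path);      /* ignore "does not exist" */
            rc = symlink(target, link_path);
            break;
        }
    }
    closedir(d);
    return rc;
}
```

Calling link_versioned_lib("/abs/path/to/lib", "libHSfoo-0.1-", "/abs/path/to/lib/libfoo.so") after the build would give the C project a stable file name to reference; the paths here are placeholders.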

Installing package from source on an initial ram filesystem

I'm trying to install multiple packages into an initial RAM file system. I'm using uClibc as my C library. This could be a stupid question but...
Would the compiled program also need a C library installed onto the initramfs?
Am I right in thinking that when a program is compiled from source, it is compiled into some sort of executable? Will the application on the initramfs be ready to run once I have run make install (with the correct prefix, and provided dependencies are met)?
Whether a compiled program needs a C library - or any kind of library, for that matter - depends on how it was linked.
In general, if your program was linked statically then it does not have any external dependencies - it only needs a working kernel. The executable code of any library that it depends on will have been incorporated into the final executable.
If, on the other hand, it is linked dynamically, then it still needs the shared object files of the libraries it depends on. On Linux, most library shared objects (also known as shared libraries) follow the convention of having a filename with either a .so extension or, in general, a *.so.* format. For example /lib/libssl3.so and /lib/libncurses.so.5.9 are both shared libraries on my system.
It is also possible to have an executable that is statically linked against some libraries and dynamically linked against others. A common case where this happens is when rare or proprietary libraries are linked in statically, while standard system libraries are linked in dynamically.
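Whether a given executable will need the dynamic loader (and hence shared libraries) on the target can be checked by looking for a PT_INTERP program header in the ELF file, which is roughly what tools such as file or readelf -l report. A minimal sketch, assuming 64-bit ELF files with the same byte order as the host; the function name elf_requests_loader is made up for this example:

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the ELF file at `path` has a PT_INTERP segment (it asks
 * for a dynamic loader), 0 if it has none, and -1 on I/O errors or if
 * the file is not a 64-bit ELF file.
 * Assumption: the file uses the host's byte order. */
int elf_requests_loader(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    int result = -1;
    Elf64_Ehdr eh;
    if (fread(&eh, sizeof eh, 1, f) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64) {
        result = 0;
        /* Walk the program header table looking for PT_INTERP. */
        for (unsigned i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            long off = (long)(eh.e_phoff +
                              (Elf64_Off)i * eh.e_phentsize);
            if (fseek(f, off, SEEK_SET) != 0 ||
                fread(&ph, sizeof ph, 1, f) != 1) {
                result = -1;
                break;
            }
            if (ph.p_type == PT_INTERP) {
                result = 1;
                break;
            }
        }
    }
    fclose(f);
    return result;
}
```

A result of 1 for a program destined for the initramfs means the loader and the shared objects it depends on have to be installed there as well; a result of 0 suggests the executable does not rely on a dynamic loader.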

What are the pros and cons of statically linking a library?

I want to release an application I developed as a hobby both for Linux and Windows. This application depends on boost (and possibly other libraries). The norm for this kind of application (a chess engine) is to provide only an executable file and possibly some helper files.
I thought it would be a good idea to statically link the libraries so the executable would not have any dependencies. So the end user can just put the executable in a directory and start using it.
However, while doing some research online I found some negative comments about statically linking libraries, some even arguing that an application with statically linked libraries would hardly be portable, meaning that it would only run on my system or highly similar systems.
So what are the pros and cons of statically linking a library?
I already know that the executable will be bigger. But I can't see why it would make my application less portable.
Pros:
No dependencies.
Cons:
Higher memory usage, as the OS can no longer use a shared copy of the library.
If the library needs to be updated, your application needs to be rebuilt. This is doubly important for libraries that then have security fixes.
Of course, a bigger issue for portability is the lack of source code distribution.
Let's say the static library "A" you include has a dependency on function "B". If this dependency can't be fulfilled by the target system, then your program won't run.
But if you're using dynamic linking, the user could maybe install another version of library "A" that uses function "C" instead of "B", so it can run successfully.
If you link the libraries statically, unless you add the smarts to also check the user's system for the libraries you've linked, you're locking your application to use those versions of the libraries until you update your executable. Security holes happen, and updates happen. (For a chess engine there may not be too much issue, but who knows.)
With dynamically linked libraries, if the library you have linked with (say, X) is not available on the user's system, your program fails ungracefully at startup, leaving the end user wondering.
With static libraries, in contrast, everything is fused into the executable, so that situation cannot arise; the executable will, however, be much bulkier.
The above problem with dynamically linked libraries can, however, be eliminated by dynamic loading.
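Dynamic loading here means opening the library at run time with dlopen and handling failure yourself, instead of letting the loader abort at startup. A minimal sketch for Linux/POSIX follows; the helper name call_optional is an assumption for illustration, and on older glibc versions the program must be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

typedef double (*unary_fn)(double);

/* Tries to load `lib`, look up `symbol` as a double -> double function,
 * and call it with `x`. On success stores the result in *out and
 * returns 1. If the library or symbol is missing, reports the problem
 * on stderr and returns 0, leaving *out untouched. */
int call_optional(const char *lib, const char *symbol,
                  double x, double *out)
{
    assert(lib != NULL && symbol != NULL && out != NULL);

    void *handle = dlopen(lib, RTLD_NOW);
    if (handle == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "cannot load %s: %s\n",
                lib, err != NULL ? err : "unknown error");
        return 0;   /* the caller can fall back or show a clear message */
    }

    void *sym = dlsym(handle, symbol);
    if (sym == NULL) {
        fprintf(stderr, "symbol %s not found in %s\n", symbol, lib);
        dlclose(handle);
        return 0;
    }

    /* POSIX permits converting the dlsym result to a function pointer. */
    unary_fn fn = (unary_fn)sym;
    *out = fn(x);
    dlclose(handle);
    return 1;
}
```

A caller could, for instance, try call_optional("libm.so.6", "cos", 0.0, &y) and degrade gracefully when it returns 0, rather than failing before main is ever reached.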

How to create a shared object that is statically linked with pthreads and libstdc++ on Linux/gcc?

Before I go to answering your question as it was described, I will note that it is not exactly clear what you are trying to achieve in the end, and there is probably a better solution to your problem.
That said - there are two main problems with trying to do what you described:
One is that you will need to decompose libpthread and libstdc++ into the object files they are made of. This is because ELF binaries (used on Linux) have two levels of "run time" library loading: even when an executable is statically linked, the loader has to load the statically linked libraries within the binary on execution and map the right memory addresses. This is done before the shared linkage of libraries that are dynamically loaded (shared objects) and mapped to shared memory. Thus, a shared object cannot be statically linked with such libraries, because by the time the object is loaded, all statically linked libraries have already been loaded. This is one difference between linking with a static library and a plain object file: a static library is not merely glued into the executable like an object file, but still contains separate tables which are referred to on loading. (I believe that this is in contrast to the much simpler static libraries in MS-DOS and classic Windows, .LIB files, but there may be more to those than I remember.)
Of course you do not actually have to decompose libpthread and libstdc++, you can just use the object files generated when building them. Collecting them may be a bit difficult though (look for the objects referred to by the final Makefile rule of those libraries). And you would have to use ld directly and not gcc/g++ to link, to avoid linking with the dynamic versions as well.
The second problem is consequential. If you do the above, you will surely have the shared object / dynamic library you asked to build. However, it will not be very useful: once you try to link a regular executable that uses libpthread/libstdc++ (the latter being any C++ program) with this shared object, it will fail with symbol conflicts. The symbols of the static libpthread/libstdc++ objects you linked your shared object against will clash with the symbols from the standard libpthread/libstdc++ used by that executable, no matter whether it is dynamically or statically linked with the standard libraries.
You could of course then try to either hide all symbols in the static objects from libstdc++/libpthread used by your shared library, make them private in some way, or rename them automatically on linkage so that there will be no conflict. However, even if you get that to work, you will find some undesirable results at runtime, since both libstdc++ and libpthread keep quite a bit of state in global variables and structures, which you would now have in duplicate, each copy unaware of the other. This will lead to inconsistencies between these global data and the underlying operating system state such as file descriptors and memory bounds (and perhaps some values from the standard C library such as errno for libstdc++, and signal handlers and timers for libpthread).
To avoid over-broad interpretation, I will add a remark: at times there can be sensible grounds for wanting to statically link against even such basic libraries as libstdc++ and even libc, and even though it is becoming a bit more difficult with recent systems and versions of those libraries (due to a bit of coupling with the loader and special linker tricks used), it is definitely possible - I did it a few times, and know of other cases in which it is still done. However, in that case you need to link a whole executable statically. Static linkage with standard libraries combined with dynamic linkage with other objects is not normally feasible.
Edit: One issue which I forgot to mention but is important to take into account is C++ specific. C++ was unfortunately not designed to work well with the classic model of object linkage and loading (used on Unix and other systems). This makes shared libraries in C++ not as portable as they should be, because a lot of things such as type information and templates are not cleanly separated between objects (often being taken, together with a lot of actual library code, from the headers at compile time). libstdc++ is for that reason tightly coupled with GCC, and code compiled with one version of g++ will in general only work with the libstdc++ from this (or a very similar) version of g++. As you will surely notice if you ever try to build a program with GCC 4 against any non-trivial library on your system that was built with GCC 3, this is not just libstdc++. If your reason for wanting to do that is trying to ensure that your shared object is always linked with the specific versions of libstdc++ and libpthread that it was built against, this would not help, because a program that uses a different/incompatible libstdc++ would also be built with an incompatible C++ compiler or version of g++, and would thus fail to link with your shared object anyway, aside from the actual libstdc++ conflicts.
If you wonder "why wasn't this done simpler?", here is a general rumination worth pondering: for C++ to work nicely with dynamic/shared libraries (meaning compatibility across compilers, and the ability to replace a dynamic library with another version with a compatible interface without rebuilding everything that uses it), compiler standardization alone is not enough. At the level of the operating system's loader, the structure and interface of object and library files and the work of the linker would need to be significantly extended beyond the relatively simple Unix classics used for natively built code on common operating systems today (Microsoft Windows, Mach-based systems and NeXTStep relatives such as Mac OS, VMS relatives and some mainframe systems also included). The linker and dynamic loader would need to be aware of such things as templates and typing, having to some extent the functionality of a small compiler to actually adapt the library's code to the type given to it. And (personal subjective observation here) it seems that higher-level intermediate code (together with higher-level languages and just-in-time compilation) is gaining ground faster and is likely to be standardized sooner than such extensions to the native object formats and linkers.
You mentioned in a separate comment that you are trying to port a C++ library to an embedded device. (I am adding a new answer here instead of editing my original answer here because I think other StackOverflow users interested in this original question may still be interested in that answer in its context)
Obviously, depending on how stripped down your embedded system is (I have not much embedded Linux experience, so I am not sure what is most likely), you may of course be able to just install the shared libstdc++ on it and dynamically link everything as you would do otherwise.
If dynamically linking with libstdc++ would not be good for you or not work on your system (there are so many different levels of embedded systems that one cannot know), and you need to link against a static libstdc++, then as I said, your only real option is static linking the executable using the library with it and libstdc++. You mentioned porting a library to the embedded device, but if this is for the purpose of using it in some code you write or build on the device and you do not mind a static libstdc++, then linking everything statically (aside from perhaps libc) is probably OK.
If the size of libstdc++ is a problem, and you find that your library is actually only using a small part of its interfaces, then I would nonetheless suggest first trying to determine the actual space you would save by linking against only the parts you need. It may be significant or not; I never looked that deep into libstdc++ and I suspect that it has a lot of internal dependencies, so while you surely do not need some of the interfaces, you may or may not still depend on a big part of its internals. I do not know and did not try, but it may surprise you. You can get an idea by just linking a binary using the library against a static build of it and libstdc++ (not forgetting to strip the binary, of course), and comparing the size of the resulting executable with the total size of a (stripped) executable dynamically linked together with the full (stripped) shared objects of the library and libstdc++.
If you find that the size difference is significant, but do not want to statically link everything, you can try to reduce the size of libstdc++ by rebuilding it without some parts you know that you do not need (there are configure-time options for some parts of it, and you can also try to remove some independent objects at the final creation of libstdc++.so). There are some tools to optimize the size of libraries; search the web (I recall one from a company named MontaVista but do not see it on their web site now, and there are some others too).
Other than the straightforward above, some ideas and suggestions to think of:
You mentioned that you use uClibc, which I never fiddled with myself (my experience with embedded programming is a lot more primitive, mostly involving assembly programming for the embedded processor and cross-compiling with minimal embedded libraries). I assume you checked this, and I know that uClibc is intended to be a lightweight but rather full standard C library, but do not forget that C++ code is hardly independent of the C library, and g++ and libstdc++ depend on some quite delicate things (I remember problems with libc on some proprietary Unix versions), so I would not just assume that g++ or the GNU libstdc++ actually works with uClibc without trying. I don't recall seeing it mentioned in the uClibc pages.
Also, if this is an embedded system, think of its performance, compute power, overall complexity, and timing/simplicity/solidity requirements. Take into consideration the complexity involved, and think whether using C++ and threads is appropriate in your embedded system, and if nothing else in the system uses those, whether it is worth introducing for that library. It may be, not knowing the library or system I cannot tell (again, embedded systems being such a wide range nowadays).
And in this case also, just a quick link I stumbled upon looking for uClibc -- if you are working on an embedded system, using uClibc, and want to use C++ code on it -- take a look at uClibc++. I do not know how much of the standard C++ stuff you need and it already supports, and it seems to be an ongoing project, so not clear if it is in a state good enough for you already, but assuming that your work is also under development still, it might be a good alternative to GCC's libstdc++ for your embedded work.
I think this guy explains quite well why that wouldn't make sense. C++ code that uses your shared object but a different libstdc++ would link alright, but wouldn't work.
