Using GHC with NVCC - haskell

As an alternative to accelerate, I'm trying to call CUDA code over Haskell's FFI.
Here's a simple program that fails to compile:
cuda_code.cu:
void cuda_init() {
cudaFree (0);
cudaThreadSynchronize ();
}
Test.hs:
foreign import ccall unsafe "cuda_init" cuda_init :: IO ()
main = cuda_init
I compiled with
$> nvcc -c -o cuda_code.o cuda_code.cu
$> ghc Test cuda_code.o
and got several linking errors (undefined reference to cudaFree, etc). This isn't terribly surprising, and the obvious solution (to me) is to link with NVCC using -pgml nvcc. (This worked when I was using Intel CILK+ in my C code: I simply changed the linker to ICC, and everything worked just fine.)
However, using NVCC to link results in the linking error:
ghc Test -pgml nvcc cuda_code.o
[1 of 1] Compiling Main ( Test.hs, Test.o )
Linking Test ...
nvcc fatal : Unknown option 'u'
Running
strace -v -f -e execve ghc Test -pgml nvcc cuda_code.o
(is there an easier way?) I discovered ghc is calling nvcc with
nvcc ... -L~/ghc... -L... -l... -l... -u ghczmprim_GHC... -u ghc...
I assume the -u options are intended for linking gcc (and apparently icc) with undefined symbols, something nvcc clearly doesn't like.
I have no knowledge about how GHC links files. Thoughts on how I can get GHC to link to my CUDA code?
--------EDIT-----------------
Someone suggested that I try linking with GCC (as usual), but pass in the necessary linker options to gcc so that it can link to the CUDA libraries. If anyone knows what these might be, this would probably work!

GHC uses /usr/lib/ghc/settings to determine compiler and linker options, and per-package files like /var/lib/ghc/package.conf.d/builtin_rts.conf to determine package-specific linker options. (Custom directory installation will have them in ${GHC}/lib/ghc-${VERSION}/settings and ${GHC}/lib/ghc-${VERSION}/package.conf.d respectively.)
Here is what I found for the RTS:
ld-options: -u ghczmprim_GHCziTypes_Izh_static_info -u
ghczmprim_GHCziTypes_Czh_static_info -u
ghczmprim_GHCziTypes_Fzh_static_info -u
ghczmprim_GHCziTypes_Dzh_static_info
...
According to the ld man page, the -u option defines a symbol as an undefined extern that must be defined somewhere else.
As far as I know, this is the only package that has these custom -u options in the ld-options: section of package.conf.d.
Unfortunately, these must be translated for a compiler/linker that uses a different option interface.
Please keep people posted about it on haskell-cafe@haskell.org. I'm sure there are others trying something like this too!

I figured out how to make this work.
cudaTest.cu:
// the `extern "C"` is important! It tells nvcc to not
// mangle the name, since nvcc assumes C++ code by default
extern "C"
void cudafunc() {
cudaFree(0);
cudaThreadSynchronize();
}
Test.hs:
foreign import ccall unsafe "cudafunc" cudaFunc :: IO ()
main = cudaFunc
Compile with:
>nvcc -c -o cudaTest.o cudaTest.cu
>ghc --make Test.hs -o Test cudaTest.o -optl-lcudart
I also tried giving GHC the option -pgmc g++ and removing the extern "C" (which I expected to work), but got compile errors in some CUDA header files. There's probably some easy way to fix this so that you don't need to explicitly tag every function with extern "C".
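One common way to avoid tagging each function is to put all the FFI-facing declarations in a header and wrap them in a single guarded extern "C" block. The sketch below assumes a hypothetical header name (cuda_api.h) and a hypothetical second function (scale_on_device); only cudafunc comes from the answer above. The bodies are placeholders without CUDA calls so that the snippet also builds with a plain C compiler. In a real .cu file you would include the header and define the functions as usual; a definition that follows an extern "C" declaration keeps C linkage.

```c
/* cuda_api.h (hypothetical name): declarations shared by cudaTest.cu
 * and any other caller. One guarded extern "C" block covers every
 * declaration inside it. A C compiler never sees the extern "C" lines. */
#ifndef CUDA_API_H
#define CUDA_API_H

#ifdef __cplusplus
extern "C" {
#endif

void cudafunc(void);          /* name taken from the answer above */
int  scale_on_device(int x);  /* hypothetical second entry point */

#ifdef __cplusplus
}
#endif

#endif /* CUDA_API_H */

/* Placeholder definitions standing in for cudaTest.cu.
 * Real code would call cudaFree(0) etc. in cudafunc and launch a
 * kernel in scale_on_device. */
void cudafunc(void)
{
    /* real code: cudaFree(0); cudaThreadSynchronize(); */
}

int scale_on_device(int x)
{
    return 2 * x;  /* stand-in for a device computation */
}
```

On the Haskell side the imports stay the same as before, for example foreign import ccall unsafe "scale_on_device" with type CInt -> IO CInt.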


Haskell FFI: stack run is ok, but GHCi does not link properly

I am trying to learn how to structure a Haskell project/workflow that uses FFI.
I am using stack, but I find myself unable to use GHCi when it comes to the imported foreign functions.
Here is a simplified version of the problem. Let's say that I have the following two files in $PROJECT_ROOT/cbits:
hello.h
#ifndef HELLO_H
#define HELLO_H
extern "C"
{
int foo();
}
#endif /* HELLO_H */
hello.cpp
#include "hello.h"
#include <iostream>
int foo()
{
std::cout << "extremely dangerous side effect" << std::endl;
return 42;
}
My Main.hs file:
module Main where
import Foreign.C
foreign import ccall unsafe "foo" foo :: IO CInt
-- this does side effects and prints '42'
main = foo >>= print
The relevant (C++ specific) section of my package.yaml is:
include-dirs:
- cbits
cxx-sources:
- cbits/*.cpp
cxx-options:
- -std=c++17
extra-libraries:
- stdc++
I am using souffle-haskell's package.yaml as a reference.
Compiling and running with stack run is ok and I get the expected output:
extremely dangerous side effect
42
But, in the GHCi session (run with stack ghci), calling main gives:
ghc: ^^ Could not load 'foo', dependency unresolved. See top entry above.
GHC.ByteCode.Linker: can't find label
During interactive linking, GHCi couldn't find the following symbol:
foo
This may be due to you not asking GHCi to load extra object files,
archives or DLLs needed by your current session. Restart GHCi, specifying
the missing library using the -L/path/to/object/dir and -lmissinglibname
flags, or simply by naming the relevant files on the GHCi command line.
Alternatively, this link failure might indicate a bug in GHCi.
If you suspect the latter, please report this as a GHC bug:
https://www.haskell.org/ghc/reportabug
The problem is not present if I compile hello.cpp beforehand:
g++ -c cbits/hello.cpp -o cbits/hello.o
And then run stack ghci --ghci-options cbits/hello.o, as suggested by the GHCi error message.
Question is: do I really need to maintain a separate *.o file specifically for GHCi? Searching online I have found discussions addressing only the GHCi part or the stack/cabal part, but not both. The only useful answer that I have found is this one from 2013, which reaffirms the "solution" given by GHCi and does not mention stack or cabal.
Question is: do I really need to maintain a separate *.o file specifically for GHCi?
Answer is: no.
After several tries, the only thing that I had to change was the name of an option:
- cxx-sources:
+ c-sources:
This left the behaviour of stack run unchanged, and allowed GHCi to link properly to the compiled code.

Clang linker finding some symbols but not others

In my .nim code, I'm using the header pragma to include symbols from /usr/local/include/node/node_api.h (which then includes /usr/local/include/node/js_native_api.h).
proc napi_create_function(
env: napi_env,
utf8name: cstring,
length: csize_t,
cb: napi_callback,
data: pointer,
res: napi_value
): int {.header:"<node/node_api.h>".}
When I run nim c foo.nim, I get Undefined symbols for architecture x86_64 for symbols in js_native_api.h (eg: napi_create_function), but the symbols in node_api.h are found by the linker. Remember that node_api.h includes js_native_api.h (as seen here).
Undefined symbols for architecture x86_64:
"_napi_create_function", referenced from:
_createFn__NEWhgHCwqbksHULYRnxXfA in @m..@s..@s..@s.nimble@spkgs@snapibindings-0.1.0@snapibindings.nim.c.o
The root problem likely isn't related to Nim, but I don't know how to use clang to check if the problem is reproducible without Nim.
So my question is:
How do I get the linker to find the missing symbols?
Versions
nim v1.4.8
clang v12.0.0
x86_64-apple-darwin19.6.0
nodejs v14.13.1 (installed with Homebrew into /usr/local/Cellar/node/14.13.1)
nim c
/Users/alec/.nimble/bin/nim
c
--colors:on
--noNimblePath
-d:NimblePkgVersion=0.1.0
--path:/Users/alec/.nimble/pkgs/nimdbx-0.4.1
--path:/Users/alec/.nimble/pkgs/nimterop-0.6.13
--path:/Users/alec/.nimble/pkgs/regex-0.19.0
--path:/Users/alec/.nimble/pkgs/unicodedb-0.9.0
--path:/Users/alec/.nimble/pkgs/cligen-1.5.4
--path:/Users/alec/.nimble/pkgs/cbor-0.6.0
--path:/Users/alec/.nimble/pkgs/napibindings-0.1.0
--path:'/Users/alec/.nimble/pkgs/docopt-#master'
--path:/Users/alec/.nimble/pkgs/regex-0.19.0
--path:/Users/alec/.nimble/pkgs/unicodedb-0.9.0
--hints:off
-o:/Users/alec/my-project/dist/foo
/Users/alec/my-project/foo.nim
clang
clang
-o
/Users/alec/my-project/foo
/Users/alec/.cache/nim/foo_d/stdlib_assertions.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_dollars.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_formatfloat.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_io.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_system.nim.c.o
/Users/alec/.cache/nim/foo_d/@m..@s..@s..@s.nimble@spkgs@snapibindings-0.1.0@snapibindings@sutils.nim.c.o
/Users/alec/.cache/nim/foo_d/@m..@s..@s..@s.nimble@spkgs@snapibindings-0.1.0@snapibindings.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_parseutils.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_math.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_unicode.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_strutils.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_posix.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_options.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_times.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_os.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_hashes.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_tables.nim.c.o
/Users/alec/.cache/nim/foo_d/@m..@s..@s..@s.nimble@spkgs@snimterop-0.6.13@snimterop@sglobals.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_streams.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_lexbase.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_parsejson.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_json.nim.c.o
/Users/alec/.cache/nim/foo_d/stdlib_cpuinfo.nim.c.o
/Users/alec/.cache/nim/foo_d/#m..#s..#s..#s.nimble#spkgs#snimdbx-0.4.1#snimdbx#sprivate#slibmdbx.nim.c.o
/Users/alec/.cache/nim/foo_d/#m..#s..#s..#s.nimble#spkgs#snimdbx-0.4.1#snimdbx#sprivate#svals.nim.c.o
/Users/alec/.cache/nim/foo_d/#m..#s..#s..#s.nimble#spkgs#snimdbx-0.4.1#snimdbx#sError.nim.c.o
/Users/alec/.cache/nim/foo_d/#m..#s..#s..#s.nimble#spkgs#snimdbx-0.4.1#snimdbx#sDatabase.nim.c.o
/Users/alec/.cache/nim/foo_d/#m..#s..#s..#s.nimble#spkgs#snimdbx-0.4.1#snimdbx#sData.nim.c.o
/Users/alec/.cache/nim/foo_d/#m..#s..#s..#s.nimble#spkgs#snimdbx-0.4.1#snimdbx#sCollection.nim.c.o
/Users/alec/.cache/nim/foo_d/#m..#s..#s..#s.nimble#spkgs#snimdbx-0.4.1#snimdbx#sTransaction.nim.c.o
/Users/alec/.cache/nim/foo_d/#m..#s..#s..#s.nimble#spkgs#snimdbx-0.4.1#snimdbx#sCRUD.nim.c.o
/Users/alec/.cache/nim/foo_d/#mfoopkg#sdata.nim.c.o
/Users/alec/.cache/nim/foo_d/#mfoopkg#sdata#sfrom_json.nim.c.o
/Users/alec/.cache/nim/foo_d/#mfoopkg#sdata#sto_json.nim.c.o
/Users/alec/.cache/nim/foo_d/#mfoopkg#squery.nim.c.o
/Users/alec/.cache/nim/foo_d/#m..#s..#s..#s.nimble#spkgs#scbor-0.6.0#scbor.nim.c.o
/Users/alec/.cache/nim/foo_d/#mfoopkg#sdata#sfrom_cbor.nim.c.o
/Users/alec/.cache/nim/foo_d/#mfoopkg#sdata#sto_cbor.nim.c.o
/Users/alec/.cache/nim/foo_d/#mfoopkg#sref.nim.c.o
/Users/alec/.cache/nim/foo_d/#mfoopkg#squery#sdocument.nim.c.o
/Users/alec/.cache/nim/foo_d/#mfoopkg#sfunctions.nim.c.o
/Users/alec/.cache/nim/foo_d/#mfoopkg#seval.nim.c.o
/Users/alec/.cache/nim/foo_d/#mfoo.nim.c.o
-lm
/Users/alec/.nimble/pkgs/nimdbx-0.4.1/libmdbx-dist/libmdbx.a
-ldl
So my question is:
How do I get the linker to find the missing symbols?
Make sure you are actually linking a library (statically or dynamically) that contains the symbols you need. (Please show how you link it.)
Make sure the library actually has the correct symbols (list them with a tool such as nm, or open the file in a hex editor and search for the names).
Make sure the library is built for the correct architecture. There are tools that let you check this (on Windows, it's dumpbin /headers file).
Make sure you are importing it correctly. I see only the {.header.} pragma; the other pragmas that are needed are absent. Please show more code and the command lines so we can investigate further.

When linking a shared library on linux, are all modules included?

I'm porting a system of apps from AIX to Linux, and all of those apps include a single shared library. I've got the shared library building as a Linux .so now, and I see at least one post here that describes how to specify what's exported from a shared library (as AIX does via a .exp file).
Just one silly question, though. On AIX, if a module in a shared library is not referenced by anything in the app that's linking to it, it is ignored by the linker. That doesn't seem to be the case on linux - but I want to make sure.
While testing my Linux shared library, I left out one module with dependencies I wasn't ready to deal with yet (or more accurately, I provided a substitute module with dummy functions for all the entry points to that module, thinking that would allow it to link). So far, so good. But when I attempted to link that shared library into a trivial test app, the linker reported unresolved symbols for stuff referenced by another shared library module that is itself only referenced from within the module I replaced with dummies. I.e., I would have expected that module to simply be ignored...
In other words, this module is being considered by the linker as part of the final application even though nothing in the app references it. I tried the same experiment on AIX (replacing the same module with dummies and attempting to link a trivial app there). No complaints.
So, the AIX linker only attempts to resolve shared library module dependencies if those modules themselves are explicitly called from the application. But the Linux linker attempts to resolve dependencies for all shared library modules whether they're called from the application or not.
Is this true? And if so, is there any way to override that behavior? Ultimately, when I port everything, all of the dependencies will resolve. But for now, it's hard to leave something out - even if it's not referenced...
Here's a minimal case:
main.c contains function main(), which calls function one().
one.c contains function one(), which does nothing.
two.c contains function two(), which calls function three().
There is no function three(), but libshared.so is built from
modules one.c and two.c. Program main is built from main.c and
links in libshared.so.
The linker needs to resolve function one(), which is in the shared
library. But that's all main.c requires. Still, function two() in
the library references function three(), which doesn't exist.
The linker will complain about the undefined symbol 'three', even
though program main doesn't need it.
On AIX the linker will not complain and everything will work.
main.c:
#include <stdio.h>
int one();
int main()
{
one();
}
one.c:
#include <stdio.h>
int one()
{
return 1;
}
two.c:
#include <stdio.h>
int three();
int two()
{
return three();
}
build libshared.so with modules one.c and two.c:
gcc -fPIC -shared one.c two.c -o libshared.so
Attempt to build main from main.c and libshared.so:
gcc main.c -o main -L. -lshared
./libshared.so: undefined reference to `three'
collect2: error: ld returned 1 exit status
The linker reports an undefined reference to 'three',
which is referenced from two() - but main() doesn't ever call two().
The actual answer: shared libraries are in fact shared objects: they are treated as a single object, not as a *.a library.
This shows that Linux (meaning: glibc/gcc/gold/ld) and AIX have different concepts regarding shared objects.
In Linux, when you link an executable, ld/gold checks the dependencies of the used shared objects as well. The AIX linker doesn't: it assumes that the shared objects are to be used as they are, and that their dependencies aren't part of the current linking. (At least this is the default behaviour.)
Here is a summary of my tests:
+----------------+--------------------+-------------------------------+
| | AIX | linux |
+----------------+--------------------+-------------------------------+
| libshared.so | only with option | yes |
| can be created | -Wl,-berok | |
+----------------+--------------------+-------------------------------+
| main | yes | only with option |
| can be created | | -Wl,--allow-shlib-undefined |
+----------------+--------------------+-------------------------------+
Note: My random thoughts regarding AIX and linking: http://lzsiga.users.sourceforge.net/aix-linking.html
By default the GNU binutils linker, ld on
Linux requires a symbol ref to be defined by some input file (i.e. object file or
shared library) in the linkage if ref is referenced by the definition of any
symbol def in any input file that the linkage needs. It doesn't matter whether def is referenced in turn.
Your program linkage needs libshared.so. libshared.so defines two, which refers to three,
so three must be defined.
You can countermand this default behaviour to tolerate undefined references in shared libraries
(but not in object files) as follows:
$ gcc main.c -o main -L. -lshared -Wl,--allow-shlib-undefined
--allow-shlib-undefined is documented in the ld manual
The notion of module in your language corresponds to translation unit at the
compilation level and object file at the linkage level. It might be helpful to
appreciate that an object file input to the linkage of an ELF program or shared library
has no distinct existence in the program or shared library. It is cut into
pieces and scattered around. So there is no sense in which it would be possible
for a linkage:
$ gcc main.c -o main -L. -lshared ...
to ignore the unreferenced module two.(c|o) within
libshared.so. There is no such thing. If that linkage did not need any
definition provided by libshared.so then it would ignore the shared library
altogether[1]. If it needs the shared library, then by default its references
must be resolved.
[1] That is, on Debian-clan systems where gcc is built to invoke ld with the --as-needed option
by default. On Redhat-clan systems GCC by default links shared libraries if they are input, needed or not.
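As an alternative to --allow-shlib-undefined during the port, the missing symbol can be given a temporary definition inside the library, so that ld has nothing left to resolve. The following is a minimal sketch using the function names from the question; the file name three_stub.c, the sentinel return value -1, and the diagnostic message are assumptions made for illustration. The three pieces are shown in one listing but would normally live in separate files.

```c
#include <stdio.h>

/* three_stub.c (hypothetical file): temporary stand-in until the real
 * module is ported. It reports the call on stderr and returns a
 * sentinel value so that accidental use is easy to spot. */
int three(void)
{
    fprintf(stderr, "three(): stub called, real module not ported yet\n");
    return -1;
}

/* two.c: unchanged in behaviour, still forwards to three() */
int two(void)
{
    return three();
}

/* one.c: the only function that main() actually needs */
int one(void)
{
    return 1;
}
```

With the stub in its own file, the library from the question would be built as gcc -fPIC -shared one.c two.c three_stub.c -o libshared.so, and gcc main.c -o main -L. -lshared should then link without extra linker options. Remove the stub once the real module is available.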

How do I invoke the “--enable-stdcall-fixup” option?

While building a DLL under Windows I get the following output:
Linking main.exe ...
Warning: resolving _findPeaksWrapper by linking to _findPeaksWrapper@16
Use --enable-stdcall-fixup to disable these warnings
Creating library file: HSdll.dll.a
Use --disable-stdcall-fixup to disable these fixups
It’s not clear to me where I should be placing the --enable-stdcall-fixup flag. Putting it into the ghc-options field of my .cabal file gives a GHC error, while putting it into cc-options or ld-options seems not to do anything (the warnings are still displayed). Where should this flag go?
Googling indicates that --enable-stdcall-fixup is an option to ld. There are a few different pathways by which cabal's final link step can happen, but in your case it is apparently
Cabal -> ghc (link step) -> gcc -> ld
so to match this you must specify
ghc-options: -optl-Wl,--enable-stdcall-fixup

crt1.o: In function `_start': - undefined reference to `main' in Linux

I am porting an application from Solaris to Linux
The object files which are linked do not have a main() defined. But compilation and linking is done properly in Solaris and executable is generated. In Linux I get this error
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
My problem is that I cannot include new .c/.o files, since it's a huge application and has been running for years. How can I get rid of this error?
Code extract from the makefile:
RPCAPPN = api
LINK = cc
$(RPCAPPN)_server: $(RPCAPIOBJ)
$(LINK) -g $(RPCAPIOBJ) -o $(RPCAPPN)_server $(IDALIBS) $(LIBS) $(ORALIBS) $(COMMONLIB) $(LIBAPI) $(CCLIB) $(THREADLIB) $(DBSERVERLIB) $(ENCLIB)
Try adding -nostartfiles to your linker options, i.e.
$(LINK) -nostartfiles -g ...
From the gcc documentation:
-nostartfiles
Do not use the standard system startup files when linking. The standard system libraries are used normally, unless -nostdlib or -nodefaultlibs is used.
This causes crt1.o not to be linked (it's normally linked by default) - normally only used when you implement your own _start code.
The -shared link option must be used when you build a .so.
The issue for me was that I had mistakenly put int main() in a namespace. Make sure you don't do that; otherwise you will get this annoying link error.
Hope this helps anyone :)
I had similar result when trying to build a new test project with boost, and it turned out that I was missing one declaration :
#define BOOST_TEST_MODULE <yourtestName>
I had this same problem when creating my c project, and I forgot to save my main.c file, so there was no main function.
I had a similar result when compiling a Fortran program that had C++ components linked in. In my case, CMake failed to detect that Fortran should be used for the final linking. The messages returned by make then ended with
[100%] Linking CXX executable myprogram
/lib/../lib64/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
make[3]: *** [myprogram] Error 1
make[2]: *** [CMakeFiles/myprogram.dir/all] Error 2
make[1]: *** [CMakeFiles/myprogram.dir/rule] Error 2
make: *** [myprogram] Error 2
The solution was to add
set_target_properties(myprogram PROPERTIES LINKER_LANGUAGE Fortran)
to the CMakeLists.txt, so that make prints out:
[100%] Linking Fortran executable myprogram
[100%] Built target myprogram
I had the same issue with a large CMake project, after I moved some functions from one code file to another. I deleted the build folder, recreated it and rebuilt. Then it worked.
Generally, with suddenly appearing linker errors, try completely deleting your build folder and rebuilding first. That can save you the headaches from trying to hunt down an error that actually simply shouldn't be there: There might be CMake cache variables floating around that have the wrong values, or something was renamed and not deleted, ...
I had the same issue as the OP, but on FreeBSD 13.1.
What solved the issue was simply adding:
int main()
{
}
The .cpp file contained only definitions and declarations, wrapped in:
extern "C"
{
<all definitions and declarations code goes here>
}
Every time I tried compiling this, the compiler kept throwing the same error as the OP's.
So all I did was add an empty main() function at the very bottom, and the code compiled with no errors.
