How can I write and compile MEX+CUDA code? [duplicate] - linux

I'm trying to use CUDA code inside MATLAB mex, under Linux. With the "whole program compilation" mode, it works well for me. I take the following two steps inside Nsight:
(1) Add "-fPIC" as a compiler option to each .cpp or .cu file, then compile them separately, each producing a .o file.
(2) Set the linker command to "mex" and add "-cxx" to indicate that all the .o input files are C++ objects, and add the library path for CUDA. Also add, as an additional input, a cpp file that contains the mexFunction entry.
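For illustration, a link invocation of this kind might look like the following (the object and output names are made up; the library path matches the CUDA 5.5 install used in the commands below):
mex -cxx a.o b.o mex_entry.cpp -L/usr/local/cuda-5.5/lib64 -lcudart -output my_mex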
This works well and the resulting mex file runs fine under MATLAB. After that, when I need to use dynamic parallelism, I have to switch to the "separate compilation" mode in Nsight. I tried the same approach as above, but the linker produces a lot of missing-reference errors, which I wasn't able to resolve.
Then I checked the compilation and linking steps of the "separate compilation" mode and got confused by what it is doing. It seems that Nsight runs two compilation steps for each .cpp or .cu file, producing a .o file as well as a .d file, like this:
/usr/local/cuda-5.5/bin/nvcc -O3 -gencode arch=compute_35,code=sm_35 -odir "src" -M -o "src/tn_matrix.d" "../src/tn_matrix.cu"
/usr/local/cuda-5.5/bin/nvcc --device-c -O3 -gencode arch=compute_35,code=compute_35 -gencode arch=compute_35,code=sm_35 -x cu -o "src/tn_matrix.o" "../src/tn_matrix.cu"
The linking command is like this:
/usr/local/cuda-5.5/bin/nvcc --cudart static --relocatable-device-code=true -gencode arch=compute_35,code=compute_35 -gencode arch=compute_35,code=sm_35 -link -o "test7" ./src/cu_base.o ./src/exp_bp_wsj_dev_mex.o ./src/tn_main.o ./src/tn_matlab_helper.o ./src/tn_matrix.o ./src/tn_matrix_lib_dev.o ./src/tn_matrix_lib_host.o ./src/tn_model_wsj_dev.o ./src/tn_model_wsj_host.o ./src/tn_utility.o -lcudadevrt -lmx -lcusparse -lcurand -lcublas
What's interesting is that the linker does not take the .d files as input. So I'm not sure how it dealt with these files, and how I should handle them with the "mex" command when linking?
Another problem is that the linking stage has several options I don't understand (--cudart static --relocatable-device-code=true), which I guess is why I cannot make it work as in the "whole program compilation" mode. So I tried the following:
(1) Compile in the same way as in the beginning of the post.
(2) Preserve the linking command as provided by Nsight but change it to use the "-shared" option, so that the linker produces a library file.
(3) Invoke mex with the library file and another cpp file containing the mexFunction entry as inputs.
This way the mex compilation works and produces a mex executable as output. However, running the resulting mex executable under MATLAB immediately causes a segmentation fault and crashes MATLAB.
I'm not sure whether this way of linking causes any problem. Stranger still, the mex linking step seems to finish trivially without even checking the completeness of the executable: even if I omit a .cpp file defining some function that mexFunction uses, it still links successfully.
EDIT:
I figured out how to manually link into a mex executable that runs correctly under MATLAB, but I haven't figured out how to do that automatically under Nsight, as I can in the "whole program compilation" mode. Here is my approach:
(1) Exclude from the build the cpp file that contains the mexFunction entry, and compile it manually with the command "mex -c".
(2) Add "-fPIC" as a compiler option to each of the remaining .cpp or .cu files, then compile them separately, each producing a .o file.
(3) Linking will fail because it cannot find the main function. We don't have one, since we use mexFunction and its file is excluded. This doesn't matter; I just leave it.
(4) Follow the method in the post below to manually dlink the .o files into a device object file:
cuda shared library linking: undefined reference to cudaRegisterLinkedBinary
For example, if step (2) produces a.o and b.o, here we do
nvcc -gencode arch=compute_35,code=sm_35 -Xcompiler '-fPIC' -dlink a.o b.o -o mex_dev.o -lcudadevrt
Note that here the output file mex_dev.o should not exist, otherwise the above command will fail.
(5) Use the mex command to link all the .o files produced in steps (2) and (4), with all necessary libraries supplied.
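For example, a hypothetical invocation (the exact objects and libraries depend on your project), with a.o and b.o from step (2), mex_dev.o from step (4), and mexfun.o produced by "mex -c" in step (1):
mex -cxx a.o b.o mex_dev.o mexfun.o -L/usr/local/cuda-5.5/lib64 -lcudadevrt -lcudart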
This works and produces a runnable mex executable. The reason I cannot automate step (1) inside Nsight is that if I change the compilation command to "mex", Nsight will also use this command to generate a dependency file (the .d file mentioned in the question text). And the reason I cannot automate steps (4) and (5) in Nsight is that they involve two commands, and I don't know how to fit both in. Please let me know if you know how to do these. Thanks!

OK, I figured out the solution. Here are the complete steps for compiling mex programs with the "separate compilation mode" in Nsight:
Create a CUDA project.
At the project level, change the following build options:
Switch on -fPIC in the compiler options of "NVCC Compiler".
Add -dlink -Xcompiler '-fPIC' to "Expert Settings" -> "Command Line Pattern" of the "NVCC Linker".
Change "Build Artifact" -> "Artifact Extension" to o, since with -dlink from the previous step we are making the output a .o file.
Add mex -cxx -o path_to_mex_bin/mex_bin_filename ./*.o ./src/*.o -lcudadevrt to "Post Build Steps" (plus any other necessary libs).
UPDATE: In my actual project I moved the last step into a .m file in MATLAB, because doing it while my mex program is running could crash MATLAB.
For the files that need to be compiled with mex, change these build options for each of them:
Change the compiler to GCC C++ Compiler in Tool Chain Editor.
Go back to the compiler settings of GCC C++ Compiler and change Command to mex.
Change command line pattern to ${COMMAND} -c -outdir "src" ${INPUTS}
Several additional notes:
(1) CUDA-specific details (such as kernel functions and kernel launches) must be hidden from the mex compiler, so they should go in the .cu files rather than the header files. Here is a trick for putting templates that involve CUDA details into .cu files.
In the header file (e.g., f.h), you put only the declaration of the function like this:
template<typename ValueType>
void func(ValueType x);
Add a new file named f.inc, which holds the definition
template<>
void func(ValueType x) {
// possible kernel launches which should be hidden from mex
}
In the source code file (e.g., f.cu), you put this
#define ValueType float
#include "f.inc"
#undef ValueType
#define ValueType double
#include "f.inc"
#undef ValueType
// Add other types you want.
This trick can be easily generalized for templated classes to hide details.
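For completeness, the calling side only needs the declaration from f.h; the linker resolves each call to the specialization compiled in f.cu. A minimal sketch, assuming the func declaration above:
#include "f.h"
void host_code() {
    func(1.0f);   // resolved at link time to the float specialization from f.cu
    func(1.0);    // resolved to the double specialization
}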
(2) mex-specific details should likewise be hidden from the CUDA source files, since mex.h alters the definitions of some system functions, such as printf. So "mex.h" should not be included in any header file that might end up included in a CUDA source file.
(3) In the mex source file containing the mexFunction entry, one can use the compiler macro MATLAB_MEX_FILE to selectively compile code sections. This way the source file can be compiled into either a mex executable or an ordinary executable, allowing debugging under Nsight without MATLAB. Here is a trick for building multiple targets under Nsight: Building multiple binaries within one Eclipse project
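A minimal sketch of such a dual entry point (run_everything is a hypothetical name standing in for your shared code path):
#ifdef MATLAB_MEX_FILE
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
    run_everything();   // mex entry point, used when built with mex
}
#else
int main() {
    run_everything();   // plain entry point, debuggable under Nsight without MATLAB
    return 0;
}
#endif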

First of all, it should be possible to set up Nsight to use a custom Makefile rather than generate one automatically. See Setting Nsight to run with existing Makefile project.
Once we have a custom Makefile, it may be possible to automate (1), (4), and (5). The advantage of a custom Makefile is that you know exactly what compilation commands will take place.
A bare-bones example (recipe lines must be indented with a tab; note that the final mex link takes the host objects together with the device-linked mx.o, matching step (5) above):
all: mx.mexa64

mx.mexa64: mx.o mxfunc.o helper.o
	mex -o mx.mexa64 mxfunc.o helper.o mx.o -L/usr/local/cuda/lib64 -lcudart -lcudadevrt

mx.o: mxfunc.o helper.o
	nvcc -arch=sm_35 -Xcompiler -fPIC -o mx.o -dlink helper.o mxfunc.o -lcudadevrt

mxfunc.o: mxfunc.c
	mex -c -o mxfunc.o mxfunc.c

helper.o: helper.c
	nvcc -arch=sm_35 -Xcompiler -fPIC -c -o helper.o helper.c

clean:
	rm -fv mx.mexa64 *.o
... where mxfunc.c contains the mexFunction but helper.c does not.
EDIT: You may be able to achieve the same effect in the automatic compilation system. Right-click each source file and select Properties, and you'll get a window where you can add compilation options for that individual file. For linking options, open the Properties of the project. Do some experiments and pay attention to the actual compilation commands that show up in the console. In my experience, custom options sometimes interact with the automatic system in odd ways. If this method proves too troublesome, I suggest you write a custom Makefile; that way, at least you are not caught out by unexpected side effects.

Related

How can I get g++ to use my own glibc build's headers correctly?

There's a TL;DR at the end if the context is too much!
Context
I am trying to update the version of glibc a project uses to 2.23 (I know it's old, that's another issue). In order to do this, I need to swap out the libraries and use the associated interpreter.
I encountered some issues when swapping out the interpreter that looked like an ABI change, so I figured it was probably because the header files had changed somehow and started working on getting those included into the project.
At first I tried using -I to include the headers, but got an error (see below). Later I tried setting --sysroot, but this quickly felt like the wrong way of doing things since I was essentially reinventing what g++ already did with system headers. I later found another mechanism that looked more promising (see Problem section).
Could this be an XY issue? Absolutely, but either way, the problem I'm seeing seems odd to me.
Problem
I looked into whether there was a different mechanism to include headers for system libraries, such as glibc, in gcc and g++. I found the flag -isystem:
-isystem dir
Search dir for header files, after all directories specified by -I but before the standard system directories. Mark it as a system directory, so that it gets the same special treatment as is applied to the standard system directories. If dir begins with "=", then the "=" will be replaced by the sysroot prefix; see --sysroot and -isysroot.
I figured that this was probably what I wanted and set about integrating this flag into the build system for the project. The resulting g++ command looks like this (simplified and broken onto multiple lines):
> /path/to/gcc-6.3.0/bin/g++
-c
-Wl,--dynamic-linker=/path/to/glibc-2.23/build/install/lib/ld-linux-x86-64.so.2
-Wl,--rpath=/path/to/glibc-2.23/build/install/lib
-isystem /path/to/glibc-2.23/build/install/include
-I.
-I/project-foo/include
-I/project-bar/include
-o example.o
example.cpp
This leads to the following error, followed by many similar ones:
In file included from /usr/include/math.h:71:0,
from /path/to/gcc-6.3.0/include/c++/6.3.0/cmath:45,
from example.cpp:42:
/path/to/glibc-2.23/build/install/include/bits/mathcalls.h:63:16: error: expected constructor, destructor, or type conversion before '(' token
__MATHCALL_VEC (cos,, (_Mdouble_ __x));
Looking into this, it appears that this particular math.h is incompatible with this version of glibc. The fact that it is used at all surprises me, because a math.h exists in the glibc directory I specified; why didn't the compiler use that one? Here's how I verified that the file exists:
> ls /path/to/glibc-2.23/build/install/include/math.h
/path/to/glibc-2.23/build/install/include/math.h
Research
I searched around on the internet for people with a similar issue and came across the following relevant things:
https://github.com/riscv/riscv-gnu-toolchain/issues/105
https://askubuntu.com/questions/806220/building-ucb-logo-6-errors-in-mathcalls-h
-isystem on a system include directory causes errors
The last of these is the most promising; it explains why -isystem won't work here, stating that the special #include_next traverses the include path in a different way. There, the solution appears to be "don't use -isystem where you can help it", but since I've tried using -I and just got the same problem again, I'm not sure how I'd apply that here.
Original issue
When compiling with the new glibc, I get the following error (our build process ends up running some of the programs it compiles to generate further source to be compiled, hence this runtime error whilst compiling):
Inconsistency detected by ld.so: get-dynamic-info.h: 143: elf_get_dynamic_info: Assertion `info[DT_RPATH] == NULL' failed!
I found a couple of relevant things about this:
https://www.linuxquestions.org/questions/linux-software-2/how-to-get-local-gcc-to-link-with-local-glibc-404087/
https://www.linuxquestions.org/questions/programming-9/inconsistency-detected-by-ld-so-dynamic-link-h-62-elf_get_dynamic_info-assertion-621701/
The only solution I see there is completely recompiling gcc against the new glibc. I'd like to avoid that if possible, which is what led me down the include route.
Eliminating the complex build system
To rule out the complex build system of the "real" project, I reproduced the problem using the following test.cpp file:
#include <cmath>
int main() {
}
Compiled using:
> /path/to/gcc-6.3.0/bin/g++ test.cpp -Wl,--dynamic-linker=/path/to/glibc-2.23/build/install/lib/ld-linux-x86-64.so.2 -Wl,--rpath=/path/to/glibc-2.23/build/install/lib
Running yields the same original issue:
> ./a.out
Inconsistency detected by ld.so: get-dynamic-info.h: 143: elf_get_dynamic_info: Assertion `info[DT_RPATH] == NULL' failed!
Trying to use the newer headers yields the same include issue:
> /path/to/gcc-6.3.0/bin/g++ test.cpp -Wl,--dynamic-linker=/path/to/glibc-2.23/build/install/lib/ld-linux-x86-64.so.2 -Wl,--rpath=/path/to/glibc-2.23/build/install/lib -isystem /path/to/glibc-2.23/build/install/include
In file included from /usr/include/math.h:71:0,
from /path/to/gcc-6.3.0/include/c++/6.3.0/cmath:45,
from test.cpp:1:
/path/to/glibc-2.23/build/install/include/bits/mathcalls.h:63:16: error: expected constructor, destructor, or type conversion before '(' token
__MATHCALL_VEC (cos,, (_Mdouble_ __x));
TL;DR
How can I get g++ to include the headers from my glibc build correctly, without it accidentally including incompatible files from /usr/include?
In your GCC version, <cmath> uses #include_next, which means that you need to make sure that the directory containing the cmath file comes before (on the include search path) the directory with the proper math.h for the version of glibc you are building against.
You can use g++ -v to view the search path. In your case, it probably looks like this:
#include "..." search starts here:
#include <...> search starts here:
.
/project-foo/include
/project-bar/include
/path/to/glibc-2.23/build/install/include
/usr/include/c++/6
/usr/include/x86_64-linux-gnu/c++/6
/usr/lib/gcc/x86_64-linux-gnu/6/include
/usr/local/include
/usr/lib/gcc/x86_64-linux-gnu/6/include-fixed
/usr/include/x86_64-linux-gnu
/usr/include
If you configure glibc with --prefix=/usr and install it with DESTDIR=/path/to/glibc-2.23/build/install, its header files will be installed into the directory /path/to/glibc-2.23/build/install/usr/include. This means you should be able to use the -isysroot option, which rewrites the default /usr/include directory, resulting in the right ordering of the search path:
#include "..." search starts here:
#include <...> search starts here:
.
/project-foo/include
/project-bar/include
/usr/include/c++/6
/usr/include/x86_64-linux-gnu/c++/6
/usr/include/c++/6/backward
/usr/lib/gcc/x86_64-linux-gnu/6/include
/usr/lib/gcc/x86_64-linux-gnu/6/include-fixed
/path/to/glibc-2.23/build/install/usr/include
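A sketch of what the resulting invocation might look like, assuming glibc was reconfigured with --prefix=/usr and installed with DESTDIR as described (the dynamic-linker and rpath paths are assumptions; check where your install actually places ld-linux-x86-64.so.2 and the libraries):
> /path/to/gcc-6.3.0/bin/g++ test.cpp \
    -isysroot /path/to/glibc-2.23/build/install \
    -Wl,--dynamic-linker=/path/to/glibc-2.23/build/install/usr/lib/ld-linux-x86-64.so.2 \
    -Wl,--rpath=/path/to/glibc-2.23/build/install/usr/lib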

Strip debug info from linux shared library

I’m using GCC to compile a shared library for ARM Linux. Here are the compiler options from my CMakeLists.txt:
add_definitions( "-std=c++14 -fvisibility=hidden -fvisibility-inlines-hidden -Wall -Wno-psabi -march=native -mfpu=neon" )
I’ve just opened the resulting .so file in a disassembler. I was disappointed to see a lot of stuff there: it showed me names for everything, including all my internal classes and functions that were never exported. Even the stuff from anonymous namespaces is still there. On the "Exports" tab of the disassembler, I only see the dozen functions I actually export (plus a few extras: .init_proc, .term_proc, _edata, __bss_end__ and call_weak_fn).
On Windows, I only see these things if I have a PDB file for the module I’m disassembling. But I don’t ship my .PDB files.
I’d like the same behavior for GCC.
Is there a way to tell GCC to stop including that debug info (i.e. the mangled name of each and every function) in the .so file, and actually compile these things down to bare binary addresses?
You need to add -Wl,--strip-debug to your linker flags.
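If you want to keep this in the build files, one way to pass that flag through CMake (a sketch against the CMakeLists.txt quoted in the question) is:
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,--strip-debug")
Alternatively, running strip --strip-debug on the finished .so removes the same sections after the fact.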

How do I tell GHC that when it, with the FFI, tries to compile a C++ file, it should look for a library in a particular folder?

I've got a Haskell file, Saturn.hs, and C++ files hssaturn.cpp and hssaturn.h, in the directory src/Galakhsy/. hssaturn.cpp needs libsaturn.cpp and/or libsaturn.hpp, which are in lib/saturn/src/lib/.
I have no idea how to compile this properly; any pointers?
Compile all the C++ files to object files using g++ -c filename.cpp. This produces, in your case, hssaturn.o and libsaturn.o. Then compile your Haskell program with ghc --make -o whatever Saturn.hs hssaturn.o libsaturn.o. Also specify any shared libraries needed by the C++ stuff with -lblabla. You probably at least need the C++ standard library, i.e. -lstdc++, making the GHC command something like
ghc --make -o whatever Saturn.hs hssaturn.o libsaturn.o -lstdc++
(well, modulo the correct paths for the two object files).
Also remember to prevent name mangling by using extern "C" for the C++ functions you call from Haskell.
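For instance, a minimal sketch of what hssaturn.h might declare (saturn_compute is a made-up name):
// hssaturn.h
#ifdef __cplusplus
extern "C" {
#endif
int saturn_compute(int x);   /* callable from Haskell without C++ name mangling */
#ifdef __cplusplus
}
#endif
On the Haskell side, the matching import (needs the ForeignFunctionInterface extension and Foreign.C.Types) would be something like:
foreign import ccall "saturn_compute" cSaturnCompute :: CInt -> IO CInt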
Addendum: The name libsaturn makes me think it is perhaps a library. You might want to consider compiling it as that and simply linking dynamically (with the -l switch to GHC as above).
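A sketch of that variant, with names and paths assumed from the question's layout (the resulting libsaturn.so must also be findable at runtime, e.g. via LD_LIBRARY_PATH):
g++ -shared -fPIC -o libsaturn.so lib/saturn/src/lib/libsaturn.cpp
ghc --make -o whatever Saturn.hs hssaturn.o -L. -lsaturn -lstdc++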

Creating .so files

There is a set of files with the .c extension: avl_tree.c, buf_read.c, db_prep.c, file_process.c, global_header.c, traverser.c. The include files they use are in the folder /usr/gcc/4.4/bin/include, except for jni.h, and the libraries are in the folder /usr/gcc/4.4/bin/lib. How do I create a .so file from them (if possible, please list all the options in the command)? I'm asking in connection with creating native methods via JNI.
You really should read the GCC documentation, notably the chapter on invoking GCC. The Program Library HOWTO is also relevant.
Very often, some builder is used to drive the build. GNU make is commonly used and has good tutorial documentation. If your Makefile-s are complex, you may also want to use GNU remake to debug them (remake is a debugging variant of make).
You usually want to compile each individual C source file into position-independent code, because shared objects require PIC. You can use
gcc -Wall -fPIC -c -o foo.pic.o foo.c
to compile a C source file foo.c into a position-independent object file foo.pic.o; you may need some other compiler options too (e.g. -I to add include directories, -D to define preprocessor symbols, -g for debugging, and -O for optimizing).
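Since the question mentions JNI: jni.h and its platform header normally live under the JDK, so a compile line for one of the listed files might look like this (assuming JAVA_HOME points at the JDK):
gcc -Wall -fPIC -I"$JAVA_HOME/include" -I"$JAVA_HOME/include/linux" -c avl_tree.c -o avl_tree.pic.o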
I strongly suggest enabling almost all warnings with -Wall (and improving your code until no warnings are given; this will improve your code's quality a little).
Then you have to link all these *.pic.o files together into a shared object with
gcc -shared *.pic.o -o foo.so
You can link some shared libraries into a shared object.
You may want to read Levine's book on linkers and loaders
Of course if you use GNU make you'll have rules in your Makefile for all this.
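A minimal sketch of such rules (the library name libnative.so is made up; recipe lines must be indented with a tab):
SRCS := avl_tree.c buf_read.c db_prep.c file_process.c global_header.c traverser.c
OBJS := $(SRCS:.c=.pic.o)

libnative.so: $(OBJS)
	gcc -shared $(OBJS) -o $@

%.pic.o: %.c
	gcc -Wall -fPIC -c $< -o $@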
You could use GNU libtool also.
Maybe dlopen(3) could interest you.
The question should probably give more information.
Most source trees come with a Makefile, a configure script, or some other means of building the output (the .so library you want).
gcc -shared -fPIC -o file.so file.c
will create an .so file from one of the source files, but you probably want a single .so built from all of them.

How to execute a Haskell program in Cygwin

I compiled my helloworld.hs and got a helloworld.o file. I tried ./helloworld, but it didn't work, so what is the right way to execute the program?
I am using Cygwin. I just run $ ghc --make helloworld.hs and get helloworld.hi, helloworld.exe.manifest, and helloworld.o files; I don't know what I need to do next...
Depending on whether you used a Cygwin ghc or a Windows-native ghc, you got either a.out (a traditional historical name) or helloworld.exe. If you have a.out, you'll need to rename it to something.exe to execute it on Windows.
You can easily tell ghc how to call the executable: ghc -o helloworld.exe --make helloworld.hs.
By the way ghc --help would have told you:
To compile and link a complete Haskell program, run the compiler like so:
    ghc-6.8.2 --make Main
where the module Main is in a file named Main.hs (or Main.lhs) in the current directory. The other modules in the program will be located and compiled automatically, and the linked program will be placed in the file `a.out' (or `Main.exe' on Windows).
As you haven't specified anything about how you compiled, for instance which compiler you're using, we can only guess.
The common way to get a .o (object) file out of ghc is using the -c switch; as the manual says, that means "do not link". The mnemonic is "compile only". Without linking, you have only a portion of a program, and it cannot be executed. Precisely what it needs to be linked against will depend on the particular object file, and some of that is filled in by default if you simply let the compiler run the linker. Linking separately is more complicated.
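To make the -c discussion concrete, a sketch of the two-step variant (on Windows/Cygwin the linked program gets an .exe suffix):
$ ghc -c helloworld.hs              # compile only: produces helloworld.o and helloworld.hi
$ ghc helloworld.o -o helloworld    # let ghc run the linker over the object file
$ ./helloworld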
