Compiling/Linking CUDA and CPP Source Files - linux

I am working through a sample program that uses both C++ source code as well as CUDA. This is the essential content from my four source files.
matrixmul.cu (main CUDA source code):
#include <stdlib.h>
#include <cutil.h>
#include "assist.h"
#include "matrixmul.h"
int main (int argc, char ** argv)
{
...
computeGold(reference, hostM, hostN, Mh, Mw, Nw); //reference to .cpp file
...
}
matrixmul_gold.cpp (C++ source code, single function, no main method):
void computeGold(float * P, const float * M, const float * N, int Mh, int Mw, int Nw)
{
...
}
matrixmul.h (header for matrixmul_gold.cpp file)
#ifndef matrixmul_h
#define matrixmul_h
extern "C"
void computeGold(float * P, const float * M, const float * N, int Mh, int Mw, int Nw);
#endif
assist.h (helper functions)
I am trying to compile and link these files so that they, well, work. So far I can get matrixmul_gold.cpp compiled using:
g++ -c matrixmul_gold.cpp
And I can compile the CUDA source code without errors using:
nvcc -I/home/sbu/NVIDIA_GPU_Computing_SDK/C/common/inc -L/home/sbu/NVIDIA_GPU_Computing_SDK/C/lib matrixmul.cu -c -lcutil_x86_64
But I just end up with two .o files. I've tried a lot of different ways to link the two .o files but so far it's a no-go. What's the proper approach?
UPDATE: As requested, here is the output of:
nm matrixmul_gold.o matrixmul.o | grep computeGold
nm: 'matrixmul.o': No such file
0000000000000000 T _Z11computeGoldPfPKfS1_iii
I think the 'matrixmul.o' missing error is because I am not actually getting a successful compile when running the suggested compile command:
nvcc -I/home/sbu/NVIDIA_GPU_Computing_SDK/C/common/inc -L/home/sbu/NVIDIA_GPU_Computing_SDK/C/lib -o matrixmul matrixmul.cu matrixmul_gold.o -lcutil_x86_64
UPDATE 2: I was missing an extern "C" from the beginning of matrixmul_gold.cpp. I added that and the suggested compilation command works great. Thank you!
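The nm output above shows the C++-mangled name of computeGold, while matrixmul.h declares the function as extern "C", so the linker was looking for the plain, unmangled name. As a minimal sketch, assuming a GCC- or Clang-style toolchain that provides <cxxabi.h>, a symbol from nm can be demangled programmatically to confirm what it refers to (the helper name demangle is my own, not part of the sample program):

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Returns the demangled form of an Itanium-ABI symbol name, or the input
// unchanged when it cannot be demangled (for example an extern "C" symbol).
std::string demangle(const char* symbol) {
    int status = 0;
    char* text = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        std::free(text);  // free(nullptr) is a no-op
        return symbol;
    }
    std::string result(text);
    std::free(text);  // __cxa_demangle allocates the buffer with malloc
    return result;
}
```

Calling demangle("_Z11computeGoldPfPKfS1_iii") should yield the six-argument computeGold signature; the command-line tool c++filt does the same job.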

Conventionally you would use whichever compiler you are using to compile the code containing the main subroutine to link the application. In this case you have the main in the .cu, so use nvcc to do the linking. Something like this:
$ g++ -c matrixmul_gold.cpp
$ nvcc -I/home/sbu/NVIDIA_GPU_Computing_SDK/C/common/inc \
-L/home/sbu/NVIDIA_GPU_Computing_SDK/C/lib \
-o matrixmul matrixmul.cu matrixmul_gold.o -lcutil_x86_64
This will link an executable binary called matrixmul from matrixmul.cu, matrixmul_gold.o and the cutil library (implicitly nvcc will link the CUDA runtime library and CUDA driver library as well).

How can I load a DSO with no versioning information?

How can I make the dynamic loader load a library with no versioning information for a library/executable that requires versioning information?
For example, say I am trying to run /bin/bash which requires symbol S with version X.Y.Z and libtinfo.so.6 provides symbol S but due to being built with a musl toolchain has no versioning information. Currently, this gives me the following error:
/bin/bash: /usr/local/x86_64-linux-musl/lib/libtinfo.so.6: no version information available (required by /bin/bash)
Inconsistency detected by ld.so: dl-lookup.c: 112: check_match: Assertion `version->filename == NULL || ! _dl_name_match_p (version->filename, map)' failed!
I am trying to avoid the process described here where I make a custom DSO that essentially maps all symbols (i.e. I would have to write out each symbol) to the appropriate symbol in the musl library. I have seen a lot of discussion about loading older versions of symbols in a DSO, but nothing about NO symbol versions.
Does this require me to recompile all binaries that use versioned symbols so they don't include versioning information?
Thanks for your help!
Update
After some investigation, I found that /bin/bash has a handful of symbols that it gets from libtinfo.so.6 such as tgoto, tgetstr, tputs, tgetent, tgetflag, tgetnum, UP, BC, and PC. When the dynamic loader tries to find the correct version of these symbols (for example, tputs@NCURSES6_TINFO_5.0.19991023) in the musl-built libtinfo.so.6 it fails as there is no versioning information in that file.
I think I have the beginnings of a hack-y solution (hopefully there is a better one out there). Essentially, I make a DSO that I compile with a GNU toolchain and load with LD_PRELOAD. In this DSO, I open the musl-built libtinfo.so.6.1 with dlopen and use dlsym to get the needed symbols. These symbols are then made globally available. While there is no version information for libtinfo.so.6, there are version sections (.gnu.version and .gnu.version_r), and I am able to execute bash without any errors or warnings. The DSO source is below:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
/* Functions */
static char *(*tgoto_internal)(const char *string, int x, int y);
static char *(*tgetstr_internal)(const char * id, char **area);
static int (*tputs_internal)(const char *string, int affcnt, int (*outc)(int));
static int (*tgetent_internal)(char *bufp, const char *name);
static int (*tgetflag_internal)(const char *id);
static int (*tgetnum_internal)(const char *id);
void __attribute__ ((constructor)) init(void);
/* Library Constructor */
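void
init(void)
{
void *handle = dlopen("/usr/local/x86_64-linux-musl/lib/libtinfo.so.6.1", RTLD_LAZY);
if (handle == NULL) {
/* Without the library, every wrapper below would call a NULL pointer */
fprintf(stderr, "dlopen failed: %s\n", dlerror());
abort();
}
/* Resolve the real implementations once, at load time */
tgoto_internal = dlsym(handle, "tgoto");
tgetstr_internal = dlsym(handle, "tgetstr");
tputs_internal = dlsym(handle, "tputs");
tgetent_internal = dlsym(handle, "tgetent");
tgetflag_internal = dlsym(handle, "tgetflag");
tgetnum_internal = dlsym(handle, "tgetnum");
}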
char *
tgoto(const char *string, int x, int y)
{
return tgoto_internal(string, x, y);
}
char *
tgetstr(const char * id, char **area)
{
return tgetstr_internal(id, area);
}
int
tputs(const char *string, int affcnt, int (*outc)(int))
{
return tputs_internal(string, affcnt, outc);
}
int
tgetent(char *bufp, const char *name)
{
return tgetent_internal(bufp, name);
}
int
tgetflag(const char *id)
{
return tgetflag_internal(id);
}
int
tgetnum(const char *id)
{
return tgetnum_internal(id);
}
/* Objects */
char * UP = 0;
char * BC = 0;
char PC = 0;
However this solution doesn't seem to work all the time, and I still see the same warning as above when testing musl-built binaries, but this time, they don't crash the tests and just print a warning.
It should also be noted that I encountered a similar versioning error before, with libreadline.so looking for versioning information in libtinfo.so. This seems to have stemmed from my musl-built libreadline.so being the wrong version (8 instead of 7): my configuration script fell back to the GNU libreadline.so, which was version 7, and that in turn tried to pull in the musl libtinfo.so, which raised the error. Building libreadline.so.7 with the musl toolchain resolved this error perfectly.
Thanks to @LorinczyZsigmond for helping me arrive at the solution! Since they don't want to post a complete answer, I will, to close the question.
The error:
/bin/bash: /usr/local/x86_64-linux-musl/lib/libtinfo.so.6: no version information available (required by /bin/bash)
Inconsistency detected by ld.so: dl-lookup.c: 112: check_match: Assertion `version->filename == NULL || ! _dl_name_match_p (version->filename, map)' failed!
tells us that /bin/bash is looking for libtinfo.so.6 in the musl lib directory. However, if we look at /bin/bash under ldd we see that in general it looks for DSO's in GNU's lib directory:
$ ldd /bin/bash
linux-vdso.so.1 (0x00007ffd485f7000)
libtinfo.so.6 => /lib/x86_64-linux-gnu/libtinfo.so.6 (0x00007f58ad8ba000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f58ad8b5000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f58ad6f4000)
/lib64/ld-linux-x86-64.so.2 => //lib64/ld-linux-x86-64.so.2 (0x00007f58ada22000)
When /bin/bash is run and the LD_LIBRARY_PATH environment variable points to the musl lib directory, the loader will try to resolve the libtinfo.so.6 dependency with musl's libtinfo.so.6, not GNU's. This causes a conflict since /bin/bash was linked against GNU's libtinfo.so.6 which has symbol versioning and perhaps more.
The fix, as said by @LorinczyZsigmond, is:
locally compiled shared objects should be searched first by locally compiled programs, but be hidden from the 'default' programs.
So essentially I needed to not mix the GNU and musl libraries which I was doing by heavy-handedly setting LD_LIBRARY_PATH=/usr/local/x86_64-linux-musl/lib.
Instead of using LD_LIBRARY_PATH, I used the rpath linker option (-L/usr/local/x86_64-linux-musl/lib -Wl,-rpath,/usr/local/x86_64-linux-musl/lib) to hard-code the path to my musl libraries into the executable. This allows musl-built binaries to link against the DSOs they need while also allowing GNU-built binaries to link against GNU-built DSOs (both of which are required when doing something like testing vim built from source).
As an aside: DT_RPATH entries in an ELF's dynamic section are searched before LD_LIBRARY_PATH. If the linker emits DT_RUNPATH instead (many newer toolchains do by default), those entries are searched after LD_LIBRARY_PATH.
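To check from inside a running process which copy of a library (for example libtinfo.so.6) the loader actually mapped, the loaded objects can be enumerated. This is only a rough sketch, assuming a Linux libc that provides dl_iterate_phdr in <link.h>; the function names loaded_objects and find_loaded are mine and purely illustrative:

```cpp
#include <link.h>
#include <string>
#include <vector>

// Callback invoked by dl_iterate_phdr once per loaded object.
static int collect_object(struct dl_phdr_info* info, size_t /*size*/, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    // The main program is typically reported with an empty name.
    names->push_back(info->dlpi_name ? info->dlpi_name : "");
    return 0;  // returning 0 continues the iteration
}

// Paths of every object currently mapped into this process.
std::vector<std::string> loaded_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_object, &names);
    return names;
}

// First loaded path containing `needle`, or an empty string if none matches.
std::string find_loaded(const std::string& needle) {
    for (const std::string& path : loaded_objects()) {
        if (path.find(needle) != std::string::npos) {
            return path;
        }
    }
    return "";
}
```

In a musl-built binary, find_loaded("libtinfo") would be expected to return a path under the musl lib directory when the rpath took effect; running ldd on the binary gives similar information without code.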

macosx thread explicitly marked deleted

I'm building an application with C++11 threads, but I can't seem to get it to work with clang++ on MacOSX 10.9. Here is the simplest example I can find that causes the issues:
#include <thread>
#include <iostream>
class Functor {
public:
Functor() = default;
Functor (const Functor& ) = delete;
void execute () {
std::cerr << "running in thread\n";
}
};
int main (int argc, char* argv[])
{
Functor functor;
std::thread thread (&Functor::execute, std::ref(functor));
thread.join();
}
This compiles and runs fine on Arch Linux using g++ (version 4.9.2) with the following command-line:
$ g++ -std=c++11 -Wall -pthread test_thread.cpp -o test_thread
It also compiles and runs fine using clang++ (version 3.5.0, also on Arch Linux):
$ clang++ -std=c++11 -Wall -pthread test_thread.cpp -o test_thread
But fails on MacOSX 10.9.5, using XCode 6.1 (regardless of whether I include the -stdlib=libc++ option):
$ clang++ -std=c++11 -Wall -pthread test_thread.cpp -o test_thread
In file included from test_thread.cpp:1:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/thread:332:5: error: attempt to use a deleted function
__invoke(_VSTD::move(_VSTD::get<0>(__t)), _VSTD::move(_VSTD::get<_Indices>(__t))...);
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/thread:342:5: note: in instantiation of function template specialization
'std::__1::__thread_execute<void (Functor::*)(), std::__1::reference_wrapper<Functor> , 1>' requested here
__thread_execute(*__p, _Index());
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/thread:354:42: note: in instantiation of function template specialization
'std::__1::__thread_proxy<std::__1::tuple<void (Functor::*)(), std::__1::reference_wrapper<Functor> > >' requested here
int __ec = pthread_create(&__t_, 0, &__thread_proxy<_Gp>, __p.get());
^
test_thread.cpp:19:15: note: in instantiation of function template specialization 'std::__1::thread::thread<void (Functor::*)(), std::__1::reference_wrapper<Functor> , void>'
requested here
std::thread thread (&Functor::execute, std::ref(functor));
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/type_traits:1001:5: note: '~__nat' has been explicitly marked deleted
here
~__nat() = delete;
^
1 error generated.
I can't figure out how to get around this; it seems like a compiler bug to me. For reference, the version of clang on that Mac is:
$ clang++ --version
Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix
Any ideas what it is I'm doing wrong?
Thanks!
Donald.
The standard does not require the std::thread constructor (or the similar std::async, for that matter) to unwrap a reference_wrapper when it is passed as the first argument after a pointer to member function, the way std::bind does. Pass a pointer to Functor instead of a reference_wrapper, i.e. &functor rather than std::ref(functor). (See issue 2219 in the Library Active Issues list.)
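A minimal sketch of that change, keeping the non-copyable Functor from the question; the ran flag and the function run_member_in_thread are additions of mine so the effect can be observed:

```cpp
#include <thread>

class Functor {
public:
    Functor() = default;
    Functor(const Functor&) = delete;  // still non-copyable, as in the question
    void execute() { ran = true; }     // hypothetical body: records that it ran
    bool ran = false;
};

// Runs Functor::execute on a new thread, passing the object by pointer
// instead of wrapping it in std::ref.
bool run_member_in_thread() {
    Functor functor;
    std::thread worker(&Functor::execute, &functor);
    worker.join();       // join makes the write to `ran` visible here
    return functor.ran;
}
```

Because only a pointer is copied into the thread, the deleted copy constructor is never needed. A lambda such as [&functor] { functor.execute(); } would be another way to get the same effect.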

c++ ~ shared object -> get host application offsets

I'm writing a shared library for a FreeBSD application.
This library gets loaded via LD_PRELOAD.
The application exists in several compiled versions, so some function offsets change between them and my library won't work with all of them.
So I want to read the offsets when the library is loaded.
Since the offsets change, I think my only option is to look them up by function name.
The offsets are simply the addresses of functions or labels.
Now the problem: how do I do it?
Example
In the first version, I reference the main function like this:
int(*main)(int argc, char *argv[])=(int(*)(int,char*[]))0x081F3XXX;
but in the second, the offset has changed:
int(*main)(int argc, char *argv[])=(int(*)(int,char*[]))0x08233XXX;
Programmers (me) are lazy and don't want to compile their libs for every version. I want to create one lib that works for every version!
I simply need the offsets of the functions by function name; the rest is no problem.
This is how I load the library:
LD_PRELOAD="/path/to/library.so" ./executable
or
env LD_PRELOAD="/path/to/library.so" ./executable
Edit with test code
Here is my test code, following up on the comments:
main.cpp:
#include <stdio.h>
void test() {
printf("Test done.\n");
}
int main(int argc, char * argv[]) {
printf("Program started\n");
test();
}
lib.cpp
#include <stdio.h>
#include <dlfcn.h>
void __attribute__ ((constructor)) my_load(void);
void my_load(void) {
printf("Library loaded\n");
printf("test - offset: 0x%x\n",dlsym(NULL,"test"));
}
test.sh
g++ main.cpp -o program
g++ -shared lib.cpp -o lib.so
env LD_PRELOAD="lib.so" ./program
-> Result:
Library loaded
test - offset: 0x0
Program started
Test done.
It doesn't seem to work :s
Edit 15:45
printf("test - offset: 0x%x\n",dlsym(dlopen("/home/test/test_proc/program",RTLD_GLOBAL),"test"));
This doesn't work either. Maybe dlsym is the wrong approach?
I reproduced your program on Mac OS X using Clang, and found a solution. First, the boring parts:
To make it compile cleanly I had to change your %x format specifier to %p for the pointer.
Then, on Mac OS X I had to pass RTLD_MAIN_ONLY as the first argument to dlsym(). I guess this is platform-dependent; on Linux it does seem to be NULL as you have.
Now, the meat of the fix!
You're searching with dlsym() for a symbol called test. But there is no such symbol in your application. Why? Because you're using C++, and C++ does "name mangling." You could use any number of tools to figure out the mangled name and try to load that with dlsym(), but it could change with different compilers. So instead, just inhibit name mangling by enclosing your test() function in extern "C":
extern "C" {
void test() {
printf("Test done.\n");
}
}
This fixed it for me:
$ DYLD_INSERT_LIBRARIES=lib.so ./program
Library loaded
test - offset: 0x1027d1eb0
Program started
Test done.
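Putting both changes together (extern "C" and %p), the lookup might be sketched as follows. This assumes a platform where RTLD_DEFAULT is available from <dlfcn.h>; test_entry, lookup and report are illustrative names of mine. On ELF platforms such as Linux or FreeBSD, a function defined in the executable itself is usually only visible to dlsym() if the program is linked with -rdynamic, so treat that as something to check rather than a guarantee:

```cpp
#include <dlfcn.h>
#include <cstdio>

// Hypothetical stand-in for the question's test(): extern "C" keeps the
// symbol name unmangled, so it can be found under its plain name.
extern "C" void test_entry() {
    std::printf("Test done.\n");
}

// Look a symbol up by name in the global scope of the running process.
// Returns nullptr when no such symbol is exported.
void* lookup(const char* name) {
    return dlsym(RTLD_DEFAULT, name);
}

// Print the address with %p, the correct format specifier for pointers.
void report(const char* name) {
    std::printf("%s - address: %p\n", name, lookup(name));
}
```

From the preloaded library's constructor, report("test") would take the place of the original printf call.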

shm_open() fails with EINVAL when creating shared memory in subdirectory of /dev/shm

I have a GNU/Linux application which uses a number of shared memory objects. It could, potentially, be run a number of times on the same system. To keep things tidy, I first create a directory in /dev/shm for each set of shared memory objects.
The problem is that on newer GNU/Linux distributions, I no longer seem to be able to create these in a sub-directory of /dev/shm.
The following is a minimal C program which illustrates what I'm talking about:
/*****************************************************************************
* shm_minimal.c
*
* Test shm_open()
*
* Expect to create shared memory file in:
* /dev/shm/
* └── my_dir
*    └── shm_name
*
* NOTE: Only visible on filesystem during execution. I try to be nice, and
* clean up after myself.
*
* Compile with:
* $ gcc shm_minimal.c -o shm_minimal -lrt
*
******************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
int main(int argc, const char* argv[]) {
int shm_fd = -1;
char* shm_dir = "/dev/shm/my_dir";
char* shm_file = "/my_dir/shm_name"; /* does NOT work */
//char* shm_file = "/my_dir_shm_name"; /* works */
// Create directory in /dev/shm
mkdir(shm_dir, 0777);
// make shared memory segment
shm_fd = shm_open(shm_file, O_RDWR | O_CREAT, 0600);
if (-1 == shm_fd) {
switch (errno) {
case EINVAL:
/* Confirmed on:
* kernel v3.14, GNU libc v2.19 (ArchLinux)
* kernel v3.13, GNU libc v2.19 (Ubuntu 14.04 Beta 2)
*/
perror("FAIL - EINVAL");
return 1;
default:
printf("Some other problem not being tested\n");
return 2;
}
} else {
/* Confirmed on:
* kernel v3.8, GNU libc v2.17 (Mint 15)
* kernel v3.2, GNU libc v2.15 (Xubuntu 12.04 LTS)
* kernel v3.1, GNU libc v2.13 (Debian 6.0)
* kernel v2.6.32, GNU libc v2.12 (RHEL 6.4)
*/
printf("Success !!!\n");
}
// clean up
close(shm_fd);
shm_unlink(shm_file);
rmdir(shm_dir);
return 0;
}
/* vi: set ts=2 sw=2 ai expandtab:
*/
When I run this program on a fairly new distribution, the call to shm_open() returns -1, and errno is set to EINVAL. However, when I run on something a little older, it creates the shared memory object in /dev/shm/my_dir as expected.
For the larger application, the solution is simple. I can use a common prefix instead of a directory.
If you could help enlighten me to this apparent change in behavior it would be very helpful. I suspect someone else out there might be trying to do something similar.
So it turns out the issue stems from how GNU libc validates the shared memory name. Specifically, the shared memory object MUST now be at the root of the shmfs mount point.
This was changed in glibc git commit b20de2c3d9 as the result of bug BZ #16274.
Specifically, the change is the line:
if (name[0] == '\0' || namelen > NAME_MAX || strchr (name, '/') != NULL)
Which now disallows '/' from anywhere in the filename (not counting leading '/')
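To make the rule concrete, here is a small sketch that mirrors the quoted condition and shows the "common prefix" style of name that passes it. It is my reading of the check, not glibc's actual code: I am assuming leading slashes are skipped before the test and that the length includes the terminating NUL. Both function names are hypothetical:

```cpp
#include <climits>
#include <cstring>
#include <string>

#ifndef NAME_MAX
#define NAME_MAX 255  // fallback, assuming the usual Linux value
#endif

// Approximates the validation quoted above: skip leading slashes, then
// reject empty names, over-long names, and any remaining '/'.
bool shm_name_accepted(const char* name) {
    while (*name == '/') {
        ++name;
    }
    std::size_t namelen = std::strlen(name) + 1;  // assumed to count the NUL
    return !(name[0] == '\0' || namelen > NAME_MAX ||
             std::strchr(name, '/') != nullptr);
}

// Turns "/my_dir/shm_name" into "/my_dir_shm_name": the leading slash is
// kept and every later slash becomes an underscore.
std::string flatten_shm_name(std::string name) {
    for (std::size_t i = 1; i < name.size(); ++i) {
        if (name[i] == '/') {
            name[i] = '_';
        }
    }
    return name;
}
```

Under these assumptions the question's "/my_dir/shm_name" is rejected, while the flattened "/my_dir_shm_name" is accepted.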
If you have a third-party tool that was broken by this shm_open change, a brilliant coworker found a workaround: preload a library that overrides the shm_open call and swaps slashes for underscores. It does the same for shm_unlink as well, so the application can properly free shared memory when needed.
deslash_shm.cc :
#include <dlfcn.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <algorithm>
#include <string>
// function used in place of the standard shm_open() function
extern "C" int shm_open(const char *name, int oflag, mode_t mode)
{
// keep a function pointer to the real shm_open() function
static int (*real_open)(const char *, int, mode_t) = NULL;
// the first time in, ask the dynamic linker to find the real shm_open() function
if (!real_open) real_open = (int (*)(const char *, int, mode_t)) dlsym(RTLD_NEXT,"shm_open");
// take the name we were given and replace all slashes with underscores instead
std::string n = name;
std::replace(n.begin(), n.end(), '/', '_');
// call the real open function with the patched path name
return real_open(n.c_str(), oflag, mode);
}
// function used in place of the standard shm_unlink() function
extern "C" int shm_unlink(const char *name)
{
// keep a function pointer to the real shm_unlink() function
static int (*real_unlink)(const char *) = NULL;
// the first time in, ask the dynamic linker to find the real shm_unlink() function
if (!real_unlink) real_unlink = (int (*)(const char *)) dlsym(RTLD_NEXT, "shm_unlink");
// take the name we were given and replace all slashes with underscores instead
std::string n = name;
std::replace(n.begin(), n.end(), '/', '_');
// call the real unlink function with the patched path name
return real_unlink(n.c_str());
}
To compile this file:
c++ -fPIC -shared -o deslash_shm.so deslash_shm.cc -ldl
And preload it before starting a process that tries to use non-standard slash characters in shm_open:
in bash:
export LD_PRELOAD=/path/to/deslash_shm.so
in tcsh:
setenv LD_PRELOAD /path/to/deslash_shm.so

G++: linker doesnt seem to link correctly

I am trying to compile a program I wrote to control a DAQ device. On Windows, g++ compiles and links it OK, but on Linux it doesn't. The linker (called by g++) displays:
g++ -Wall -o "acelerar-30-0" "acelerar-30-0.cpp" (in directory: /home/poly/)
/tmp/ccRLpB4q.o: In function `main':
acelerar-30-0.cpp:(.text+0x429): undefined reference to `AdxInstantAoCtrlCreate'
collect2: ld returned 1 exit status
Compilation failed.
The cpp file is this (cut):
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "compatibility.h"
#include "bdaqctrl.h"
#include "comunes.h"
using namespace Automation::BDaq;
#define deviceDescription L"USB-4704,BID#0"
int32 channelStart = 0;
int32 channelCount = 1;
double voltaje[0];
int32 modo;
int32 ms;
int main(int argc, char* argv[])
{
if (argc!=3)
salidaerror(argv[0],1);
channelStart = atoi(argv[1]);
ms = atoi(argv[2]);
if (channelStart<0||channelStart>1||ms<10)
salidaerror(argv[0],1);
ErrorCode ret = Success;
InstantAoCtrl * instantAoCtrl = AdxInstantAoCtrlCreate();
...
I have spent several hours on this and can't find the answer. The SDK is for Debian/Ubuntu, and it has the same code for Linux and Windows.
Any hints? Thanks
EDIT: Removed some marks as the formatting was incorrect
In my (limited) experience, typical gcc behavior will require that you specify the library containing that function as an argument on the command line like so:
-lsome_library
This is required even if the library is in your library path (additional library paths can be specified with -L). Find the appropriate library file containing that function and use its filename minus extensions and leading "lib" in the argument format above.
