I am compiling a message-passing program using OpenMPI with mpicxx on a Linux desktop. My makefile does the following:
mpicxx -c readinp.cpp
mpicxx -o exp_fit driver.cpp readinp.o
at which point I get the following error:
/usr/lib64/gcc/x86_64-suse-linux/4.5/../../../../x86_64-suse-linux/bin/ld: cannot find -lnuma
My questions are:
What is -lnuma? What is using it? How should I go about linking to it?
Thanks Jonathan Dursi!
On Ubuntu, the package name is libnuma-dev.
apt-get install libnuma-dev
The build script can't find the NUMA (Non-Uniform Memory Access) library. The -l option tells the linker to link against the library, but your system either doesn't have the right one installed or the linker's search path is incomplete/wrong.
Try querying your package manager (apt or rpm) for a package providing libnuma.
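For example (exact package names vary by distribution, so treat these as a sketch):
apt-cache search libnuma   # Debian/Ubuntu
rpm -qa | grep -i numa     # RPM-based systems
zypper search numa         # SUSE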
OpenMPI, and I think MPICH2, uses libnuma ("a simple programming interface to the NUMA (Non Uniform Memory Access) policy supported by the Linux kernel") for memory affinity -- to ensure that the memory for a particular MPI task stays close to the core that the task is running on, as opposed to being kept on another socket entirely. This is important for performance on multicore nodes.
You may need to use YaST to install libnuma-devel if your linker can't find the library.
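If you prefer the command line to YaST, something like this should work, assuming the package is named libnuma-devel in your repositories:
sudo zypper install libnuma-devel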
I got the same error working on a remote server, which had the NUMA library installed. In particular, the file /usr/lib64/libnuma.so.1 existed. It appears that the linker only looked for the file under the name libnuma.so. Creating the symlink
ln -s /usr/lib64/libnuma.so.1 /usr/lib64/libnuma.so
as described here might have worked, but in my case I did not have permission to create files in /usr/lib64. I got around this by creating the symlink in some other location to which I have write permission:
ln -s /usr/lib64/libnuma.so.1 /some/path/libnuma.so
and then adding this path to the link flags. In your case this would be:
mpicxx -L/some/path -o exp_fit driver.cpp readinp.o
In my case, a larger build process (compiling FFTW), I added the path to the LDFLAGS environment variable:
export LDFLAGS="${LDFLAGS} -L/some/path"
which fixed the issue.
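As a quick check that the workaround took effect, you can inspect the resulting binary (exp_fit is the binary name from the question); the library should now resolve:
ldd exp_fit | grep numa
# expected: something like "libnuma.so.1 => /usr/lib64/libnuma.so.1"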
If you have an answer for this, or further information, I'd welcome it. I'm following advice from here to offer some unsolicited help by posting this question and then an answer I've already found for it.
I have a bare-metal ARM board for which I'm building a cross-toolchain, from sources for GNU binutils, gcc and gdb, and for SourceWare's Newlib. I got those four working and cross-built a DoNothing.c into an ELF file - but I couldn't disassemble it with this:
$ arm-none-eabi-objdump -S DoNothing.elf
The error was:
arm-none-eabi-objdump: error while loading shared libraries: libdebuginfod.so.1: cannot open shared object file: No such file or directory
I'll follow up with a solution.
The error was correct - my system didn't have libdebuginfod.so.1 installed - but I have another cross-binutils, installed from binary for a different target, and its objdump -S works fine on the same host. Why would one build of objdump complain about missing that shared library, when clearly not all builds of objdump need it?
First I tried rebuilding cross binutils, specifying --without-debuginfod as a configure option. No change, which seems odd: surely that should build tools that not only don't use debuginfod but which don't depend on it in any way. (If someone can answer that, or point out what I've misunderstood, it may help people.)
Next I figured debuginfod was inescapable (for my cross-tools built from source at least), so I'd install it to get rid of the error. It's a component of the elfutils package, but installing the latest elfutils available for my Ubuntu 20.04 system didn't bring libdebuginfod.so.1 with it.
I found a later one, for Arch Linux, whose package contents suggested it would - but its package format doesn't match Ubuntu's and installing it was going to involve a lot of work. Instead I opted to build it from the Arch Linux source package. However, running ./configure on that gave a couple of infuriatingly similar errors:
configure: checking libdebuginfod dependencies, --disable-libdebuginfod or --enable-libdebuginfo=dummy to skip
...
configure: error: dependencies not found, use --disable-libdebuginfod to disable or --enable-libdebuginfod=dummy to build a (bootstrap) dummy library.
No combination of those suggestions would allow configure for elfutils-0.182 to run to completion.
The problem of course was my own lack of understanding. The solution came from the Linux From Scratch project: what worked was to issue configure with both of the suggested options, like this:
$ ./configure --prefix=/usr \
--disable-debuginfod \
--enable-libdebuginfod=dummy \
--libdir=/lib
That gave a clean configure; make worked first time, as did make check, and then sudo make install installed libdebuginfod.so.1 as required. I then had an arm-none-eabi-objdump which disassembles cross-compiled ELF files without complaining.
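As a sanity check afterwards (names and files as in the question), the dependency should now resolve and the disassembly should work:
ldd $(which arm-none-eabi-objdump) | grep debuginfod
arm-none-eabi-objdump -S DoNothing.elf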
I am installing Open MPI v1.8.8 with CUDA v7.5 on my Debian Linux system.
I have tested CUDA and it works, and I have tested Open MPI and it works too. But when I try to combine them in one program, I get an error: cannot find cuda.h. This is my scenario:
My program's source code includes these header files:
#include "cuda.h"
#include "mpi.h"
I run the command:
mpicc <filePath> -o test
And this error appears:
cuda.h: No such file or directory
#include "cuda.h"
ompi_info gives me: mca:mpi:base:param:mpi_built_with_cuda_support:value:false
I have googled, and I followed some methods I found:
./configure --with-cuda
./configure --with-cuda=/usr/local/cuda-7.5
(source link: http://mirror.its.dal.ca/openmpi/faq/?category=buildcuda)
After that, I re-ran make all and make install for Open MPI. When I run mpicc or mpirun, I get the error: mca: base: component find: unable to open /usr/local/lib/openmpi/mca_mpool_sm
I set up a soft link: ln -s /usr/local/cuda/include /usr/include (described in the link: Building CUDA-aware openMPI on Ubuntu 12.04 cannot find cuda.h).
But it did not fix my issue.
Has anyone installed this successfully? Please help me or share your experience.
Thanks.
I think you are confusing installation problems with incorrect compiler options. It will be necessary to explicitly specify the include paths, library paths, and libraries for CUDA when compiling and linking host code with your MPI-wrapped host compiler.
Something like:
mpicc -I/usr/local/cuda-7.5/include -L/usr/local/cuda-7.5/lib -o test <filePath> -lcuda
would be the normal way to build a simple MPI program which calls the CUDA driver API. You will need to add nvcc compilation for device code, and for host code which uses the runtime API; a sketch follows.
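A minimal sketch of such a mixed build, where kernel.cu and main.c are hypothetical file names and lib64 may be plain lib on 32-bit installs:
nvcc -c kernel.cu -o kernel.o    # device code compiled by nvcc
mpicc -I/usr/local/cuda-7.5/include main.c kernel.o -L/usr/local/cuda-7.5/lib64 -lcudart -o test    # host code linked via the MPI wrapper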
The apparent lack of CUDA support in your MPI flavour is a separate question and one you should probably take up in another forum (like the user mailing list of the MPI flavour you use).
I'm trying to install gcc 4.9 on a SUSE system without an internet connection. I compiled gcc on an Ubuntu machine and installed it into a prefix, then copied the prefix folder to the SUSE machine. When I tried to run it, gcc complained about not finding GLIBC_2.14, so I downloaded an rpm for libc6 online and unpacked it into the prefix folder. My LD_LIBRARY_PATH includes prefix/lib and prefix/lib64. When I try to run any program now (ls, cp, cat, etc.) I get the error: error while loading shared libraries: /home/***/prefix/lib64/libc.so.6: unexpected reloc type 0x25.
Is there any way I can fix this so that I can get gcc4.9 up and running on this system?
As an alternative, is it possible to build gcc statically so that I don't have to worry about linking at all when I transfer it between computers?
my LD_LIBRARY_PATH includes prefix/lib and prefix/lib64
See this answer for explanation of why this can't work.
Is there any way I can fix this so that I can get gcc4.9 up and running on this system?
Your best bet is to install whatever GCC package comes with the SuSE system, then use that GCC to configure and install gcc-4.9 on it.
If for some reason you can't do that, this answer has some of the ways in which you can build gcc-4.9 on a newer system and have it still work on an older one.
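Going back to the first suggestion, here is a minimal sketch of a native bootstrap, assuming the gcc-4.9 source tarball and its prerequisites (GMP, MPFR, MPC, unpacked into the source tree) have already been copied onto the offline SUSE machine:
tar xf gcc-4.9.4.tar.bz2
mkdir gcc-build && cd gcc-build
../gcc-4.9.4/configure --prefix=$HOME/gcc49 --disable-multilib
make -j4 && make install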
is it possible to build gcc staticaly so that I don't have to worry about linking at all when I transfer it between computers?
Contrary to popular belief, fully-static binaries are generally less portable than dynamic ones on Linux.
When running:
sudo /sbin/ldconfig
the following error appears:
/sbin/ldconfig: /usr/local/lib/ is not a symbolic link
When I run the file command, the below appears:
file /usr/local/lib/
/usr/local/lib/: directory
Inside /usr/local/lib/ there are three libraries that I use; I'll call them lib1, lib2 and lib3 here.
Now, when I run ldd on my binary, the result is:
lib1.so => not found
lib2.so => not found
lib3.so => /usr/local/lib/lib3.so (0x00216000)
But all of them are in that same folder: /usr/local/lib/{lib1,lib2,lib3}.so.
Every time I run ldconfig, the same error appears:
/usr/local/lib/ is not a symbolic link
I thought /usr/local/lib might be declared twice in /etc/ld.so.conf.d/*.conf, but it is not:
sudo egrep '\/usr\/local' /etc/ld.so.conf.d/*
projectA.conf.old:/usr/local/projectA/lib
local.conf:/usr/local/lib
ld.so.conf only includes /etc/ld.so.conf.d/*.conf, so this *.old isn't processed, and it refers to /usr/local/projectA/lib.
After some time experimenting, I deleted lib1 and lib2 entirely (at one point I had tried placing them in the binary's folder), but the same error still occurred.
I ran into this issue with the Oracle 11R2 client. Not sure if the Oracle installer did this or someone did it here before I arrived. It was not 64-bit vs 32-bit, all was 64-bit.
The error was that libexpat.so.1 was not a symbolic link.
It turned out that there were two identical files, libexpat.so.1.5.2 and libexpat.so.1. Removing the offending file and making it a symlink to the 1.5.2 version caused the error to go away.
Makes sense that you'd want the well-known name to be a symlink to the current version. If you do this, it's less likely that you'll end up with a stale library.
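The repair looked roughly like this (the directory is an assumption; adjust to wherever the Oracle client keeps its libraries):
cd /path/to/oracle/lib
sudo rm libexpat.so.1                        # remove the duplicate regular file
sudo ln -s libexpat.so.1.5.2 libexpat.so.1   # recreate it as a symlink
sudo /sbin/ldconfig                          # the error should now be gone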
I simply ran the command below:
export LD_LIBRARY_PATH=/usr/lib/
Now it is working fine.
Solved, at least as far as this question goes.
I searched the web before asking, and there was no conclusive solution. The reason for this error is that lib1.so and lib2.so are not OK: they were very probably compiled for a 32-bit machine rather than for this 64-bit PC, whereas lib3.so is a 64-bit library. At least that is my hypothesis.
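A quick way to test that hypothesis (lib1.so being the placeholder name from the question):
file /usr/local/lib/lib1.so
# "ELF 32-bit LSB shared object" on a 64-bit system would confirm it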
Very unfortunately, ldconfig doesn't give a clean error message saying that it could not load the library; it only prints:
ldconfig: /folder_where_the_wicked_lib_is/ is not a symbolic link
I solved this by removing the libraries that ldd reported as not found against the binary. Now that I know where the problem lies, it is easier to deal with.
My ld version is GNU ld 2.20.51, and I don't know whether a more recent version has a better message for its users.
Thanks.
You need to include the path of the libraries inside /etc/ld.so.conf, and rerun ldconfig to update the cache.
Another possibility is to add the path to your library to the LD_LIBRARY_PATH environment variable, and rerun the executable.
Check whether the symbolic links point to a valid library...
You can also add the path directly in /etc/ld.so.conf, without an include file.
Run ldconfig -p to see whether your library is properly listed in the cache; a combined example follows.
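Putting those suggestions together, a typical sequence looks like this (local.conf and lib1 are placeholder names):
echo '/usr/local/lib' | sudo tee /etc/ld.so.conf.d/local.conf
sudo ldconfig
ldconfig -p | grep lib1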
I have also faced the same issue.
The solution: the file for which you are getting the error is probably a duplicate of the actual file with another version. Removing the particular file on which the error is thrown can resolve the issue.
Simply run in a shell:
sudo apt-get install --reinstall libexpat1
I got the same problem with libxcb and solved it this way - very fast :)
Good afternoon,
I am having difficulties with libxml2.
I tried to build the Perl module XML-LibXML which is part of our standard runtime environment. However, this time the installation on a RHEL5 box failed, because the build process complained about missing libxml2:
$> perl Makefile.PL LIB=/foo/lib/perl PREFIX=/foo INSTALLDIRS=site
enable native perl UTF8
running xml2-config...ok (2.7.6)
looking for -lxml2... no
looking for -llibxml2... no
libxml2 not found
However, the file was available. Starting the build with
perl Makefile.PL LIB=/usr/inform/target/lib/perl PREFIX=/usr/inform/target INSTALLDIRS=site
led to more evidence of the real problem:
[...]
Can't load 'blib/arch/auto/Conftest/Conftest.so' for module Conftest: /usr/inform/target/lib/libxml2.so.2: cannot restore segment prot after reloc: Permission denied at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/DynaLoader.pm line 230.
at test.pl line 2
[...]
After some investigation I found that the problem appears to be that libxml2.so is built with text relocations:
[tess91#INF-AW] lib$ eu-findtextrel libxml2.so.2.7.6
the file containing the function 'get_crc_table' is not compiled with -fpic/-fPIC
the file containing the function 'crc32' is not compiled with -fpic/-fPIC
the file containing the function 'gzerror' is not compiled with -fpic/-fPIC
[...]
And since we have SELinux active on the target machine, linking against libxml2 failed!
Is there any way to build libxml2 properly, or do I have to ask the admin to tweak SELinux to allow text relocations?
I really can't believe I am the only one having this problem on Linux with SELinux active. What am I missing?
Any help appreciated!
Regards,
Stefan
The simplest way is to have your administrator yum install libxml2-devel or even yum install perl-XML-LibXML. Otherwise, see if you can add -fPIC to the CFLAGS in the Makefile.PL.
I assume you are on 32-bit x86; any other architecture wouldn't work without -fPIC.
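If you go the Makefile.PL route, one hedged way to append the flag without clobbering the default compiler options is MakeMaker's CCFLAGS attribute (paths as in the question):
perl Makefile.PL LIB=/foo/lib/perl PREFIX=/foo INSTALLDIRS=site CCFLAGS="$(perl -MConfig -e 'print $Config{ccflags}') -fPIC"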
I just found a possible explanation:
During the build of libxml2 the compiler flag -fPIC is indeed used, so the code is created position-independent, BUT:
When creating the shared library, the static libz is linked into it. Is that the source of my problem? Does including a static lib in a shared library taint it by introducing non-relocatable code?
The symbols reported by eu-findtextrel should already have pointed me in that direction, since crc32, get_crc_table, gzerror, etc. look like zlib code rather than libxml2's own...
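If the static libz is indeed the culprit, a hedged sketch of a fix would be to rebuild zlib as position-independent code and point libxml2 at it when rebuilding; version numbers and prefixes here are assumptions:
cd zlib-1.2.3
CFLAGS="-fPIC" ./configure --prefix=/usr/inform/target
make && make install
cd ../libxml2-2.7.6
./configure --prefix=/usr/inform/target --with-zlib=/usr/inform/target
make && make install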