Valgrind: shared libraries for the executable program could not be loaded - linux

I have some weird memory related errors in my program.
It uses intel mkl and therefore depends upon some mkl specific shared libraries.
When I run my program, it segfaults after it has done most of the work. The segfaults occurs in the function call fclose() to a file pointer that is not null.
When I run my program through gdb, the stacktrace is not very useful.
I therefore wanted to run valgrind to find possible errors in my code.
But, I cannot run the executable from valgrind. It prints the following error message.
==52778== Memcheck, a memory error detector
==52778== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==52778== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==52778== Command: ./main.exe
==52778==
./main.exe: error while loading shared libraries: libmkl_intel_lp64.so: cannot open shared object file: No such file or directory
The shared library libmkl_intel_lp64.so is present in the same directory (as well as all other shared libraries that my executable depends upon).
How do I resolve this problem, so that I can valgrind my code?
Thanks.
Edit: I also set (and checked) the environment variable LD_LIBRARY_PATH to the current directory, but it did not help.
Edit: Running on Linux 64 bit, using intel compilers 2017

The shared library libmkl_intel_lp64.so is present in the same
directory (as well as all other shared libraries that my executable
depends upon).
How do I resolve this problem, so that I can valgrind my code?
valgrind provides much of it's own environment and wrappers for various functions in order to do its job. Since you have set LD_LIBRARY_PATH and are still experiencing problems finding your library, your other option is to provide the library search path within the executable itself using the linker option -rpath=/path/to/dir that contains the library. The addition to the compile string would be:
-Wl,-rpath=/path/to/dir /* that has libmkl_intel_lp64.so in it */
Then finding the library doesn't depend on the external environment or the hope that valgind will extend its library search to the current working directory.
(glad it worked)

Related

warning libgfortran.so.3 needed by may conflict with libgfortran.so.5

while compiling I get the following warning:
/usr/bin/ld: warning: libgfortran.so.3, needed by /usr/openmpi-4.0.3rc4/lib64/libmpi_usempi.so, may conflict with libgfortran.so.5
It does create the .exe but when executing it an error occurs:
ideal.exe: error while loading shared libraries: libgfortran.so.5: cannot open shared object file: No such file or directory
I search for it to try and link it but it didn't work
whereis libgfortran.so.5
libgfortran.so: /usr/lib64/libgfortran.so.3
I don't have much knowlegde about linux or compilers and I'm working on a SUSE server without sudo permission. The gnu fortran compiler I'm using is in my home directory /home/gomezmr/gcc . Does anyone know how to solve this? Thank you.
Your OpenMPI library was compiled for a different version of GCC/gfortran than the version you are using for compiling. The MPI library must be compiled for the same compiler version that you are using for compiling.
In simple cases it may happen that it will somehow work anyway, but problems like yours can happen. When using the mpi or mpi_f08 modules, the major release version must match (e.g. both GCC9 or both GCC 11,...).

GDB - Loading Debug information from external ".sym" files

I am attempting to carry out postmortem analysis of a crashed binary, "TestApp", on a linux system.
I have a copy of the binaries and and shared objects that are copied onto the device in a path:
/usr/public/target
This folder contains all the binaries in question in the directory structure used on the system under test, ie:
/usr/public/target/sbin/TestApp
/usr/public/target/lib/TestAppLib.so
/usr/public/target/usr/lib/TestAppAPILib.so
The automated build process strips the debug information from the binaries, and stores them in external, symbol files, all under:
/usr/public/target_external_symbols
So the symbol information for the above binaries would exist in files named:
/usr/public/target_external_symbols/sbin/TestApp.sym
/usr/public/target_external_symbols/lib/TestAppLib.so.sym
/usr/public/target_external_symbols/usr/lib/TestAppAPILib.so.sym
How do I get GDB to be aware of the existence of these external symbols and to load them?
I typically invoke GDB via:
gdb TestApp TestApp.core
I've referred to other articles on creating a test .gdbinit file and passing it to GDB via the -command argument, but it doesn't appear to work. Every time I attempt to get a backtrace from my core file, I get an indication from GDB that it cannot open the debug symbols. Any help in resolving this is appreciated.
(gdb) info shared
From To Syms Read Shared Object Library
0x78000000 0x780061e8 Yes (*) /usr/public/target/lib/TestAppLib.so
0x78010000 0x7806e60c Yes (*) /usr/public/target/usr/lib/TestAppAPILib.so
0x78070000 0x78091d2c Yes (*) /usr/public/target/lib/libm.so.2
(*): Shared library is missing debugging information.
Thank you.
There are commands set debug-file-directory, symbol-file and add-symbol-file to load debugging symbols from within a gdb session. The latter one might require the address where the shared library was loaded into memory.
Maybe during your build process 'gnu debuglinks' have been added to your binaries. This would mean that in the executables there's a path coded that directs gdb where to look for debug symbols. More can be found here.

How do I compile and link a 32-bit Windows executable using mingw-w64

I am using Ubuntu 13.04 and installed mingw-w64 using apt-get install mingw-w64. I can compile and link a working 64-bit version of my program with the following command:
x86_64-w64-mingw32-g++ code.cpp -o app.exe
Which generates a 64-bit app.exe file.
What binary or command line flags do I use to generate a 32-bit version of app.exe?
That depends on which variant of toolchain you're currently using. Both DWARF and SEH variants (which come starting from GCC 4.8.0) are only single-target. You can see it yourself by inspecting the directory structure of their distributions, i.e. they contain only the libraries with either 64- or 32-bit addressing, but not both. On the other hand, plain old SJLJ distributions are indeed dual-target, and in order to build 32-bit target, just supply -m32 flag. If that doesn't work, then just build with i686-w64-mingw32-g++.
BONUS
By the way, the three corresponding dynamic-link libraries (DLLs) implementing each GCC exception model are
libgcc_s_dw2-1.dll (DWARF);
libgcc_s_seh-1.dll (SEH);
libgcc_s_sjlj-1.dll (SJLJ).
Hence, to find out what exception model does your current MinGW-w64 distribution exactly provide, you can either
inspect directory and file structure of MinGW-w64 installation in hope to locate one of those DLLs (typically in bin); or
build some real or test C++ code involving exception handling to force linkage with one of those DLLs and then see on which one of those DLLs does the built target depend (for example, can be seen with Dependency Walker on Windows); or
take brute force approach and compile some test code to assembly (instead of machine code) and look for presence of references like ___gxx_personality_v* (DWARF), ___gxx_personality_seh* (SEH), ___gxx_personality_sj* (SJLJ); see Obtaining current GCC exception model.

What do I need to debug pthreads?

I want to debug pthreads on my custom linux distribution but I am missing something. My host is Ubuntu 12.04, my target is an i486 custom embedded Linux built with a crosstool-NG cross compiler toolset, the rest of the OS is made with Buildroot.
I will list the facts:
I can run multi-threaded applications on my target
Google Breakpad fails to create a crash report when I run a multi-threaded application on the target. The exact same application with the exact same build of Breakpad libraries will succeed when I run it on my host.
GDB fails to debug multithreaded applications on my target.
e.g.
$./gdb -n -ex "thread apply all backtrace" ./a.out --pid 716
dlopen failed on 'libthread_db.so.1' - /lib/libthread_db.so.1: undefined symbol: ps_lgetfpregs
GDB will not be able to debug pthreads.
GNU gdb 6.8
I don't think ps_lgetfpregs is a problem because of this.
My crosstool build created the libthread_db.so file, and I put it on the target.
My crosstool build created the gdb for my target, so it should have been linked against the same libraries that I run on the target.
If I run gdb on my host, against my test app, I get a backtrace of each running thread.
I suspect the problem with Breakpad is related to the problem with GDB, but I cannot substantiate this. The only commonality is lack of multithreaded debug.
There is some crucial difference between my host and target that stops me from being able to debug pthreads on the target.
Does anyone know what it is?
EDIT:
Denys Dmytriyenko from TI says:
Normally, GDB is not very picky and you can mix and match different
versions of gdb and gdbserver. But, unfortunately, if you need to
debug multi-threaded apps, there are some dependencies for specific
APIs...
For example, this is one of the messages you may see if you didn't
build GDB properly for the thread support:
dlopen failed on 'libthread_db.so.1' - /lib/libthread_db.so.1:
undefined symbol: ps_lgetfpregs GDB will not be able to debug
pthreads.
Note that this error is the same as the one that I get but he doesn't go in to detail about how to build GDB "properly".
and the GDB FAQ says:
(Q) GDB does not see any threads besides the one in which crash occurred;
or SIGTRAP kills my program when I set a breakpoint.
(A) This frequently
happen on Linux, especially on embedded targets. There are two common
causes:
you are using glibc, and you have stripped libpthread.so.0
mismatch between libpthread.so.0 and libthread_db.so.1
GDB itself does
not know how to decode "thread control blocks" maintained by glibc and
considered to be glibc private implementation detail. It uses
libthread_db.so.1 (part of glibc) to help it do so. Therefore,
libthread_db.so.1 and libpthread.so.0 must match in version and
compilation flags. In addition, libthread_db.so.1 requires certain
non-global symbols to be present in libpthread.so.0.
Solution: use
strip --strip-debug libpthread.so.0 instead of strip libpthread.so.0.
I tried a non-stripped libpthread.so.0 but it didn't make a difference. I will investigate any mismatch between pthread and thread_db.
This:
dlopen failed on 'libthread_db.so.1' - /lib/libthread_db.so.1: undefined symbol: ps_lgetfpregs
GDB will not be able to debug pthreads.
means that the libthread_db.so.1 library was not able to find the symbol ps_lgetfpregs in gdb.
Why?
Because I built gdb using Crosstoolg-NG with the "Build a static native gdb" option and this adds the -static option to gcc.
The native gdb is built with the -rdynamic option and this populates the .dynsym symbol table in the ELF file with all symbols, even unused ones. libread_db uses this symbol table to find ps_lgetfpregs from gdb.
But -static strips the .dynsym table from the ELF file.
At this point there are two options:
Don't build a static native gdb if you want to debug threads.
Build a static gdb and a static libthread_db (not tested)
Edit:
By the way, this does not explain why Breakpad in unable to debug multithreaded applications on my target.
Just a though... To use the gdb debugger, you need to compile your code with -g option. For instance, gcc -g -c *.c.

Why can't I directly start a shared library in Linux?

$ chmod +x libsomelibrary.so
$ ./libsomelibrary.so
Segmentation fault
$ gcc -O2 http://vi-server.org/vi/bin/rundll.c -ldl -o rundll
$ ./rundll ./libsomelibrary.so main
(application starts normally)
Why can't I just start libsomelibrary.so if it has usable entry point?
rundll.c is trivial:
void* d = dlopen(argv[1], RTLD_LAZY);
void* m = dlsym(d, argv[2]);
return ((int(*)(int,char**,char**))m)(argc-2, argv+2, envp);
Why is it not used internally when attempting to load a binary?
main is not the entry point recognised by the kernel or the dynamic linker - it is called by the startup code linked into your executable when you compile it (such startup code isn't linked into shared libraries by default).
The ELF header contains the start address.
Shared libraries are not designed to be directly runnable. They are designed to be linked into another codebase. It may have a usable entry point, but being executable entails more than merely having a usable entry point. The rundll utility is proof of this. Your second test shows that the shared library is indeed executable, but only after rundll does some work. If you are curious as to what all has to be done before the library code can be executed, take a look at the source code for rundll.
You can start shared libraries in Linux.
For example, if you start /lib/libc.so.6, it will print out its version number:
$ /lib/libc.so.6
GNU C Library stable release version 2.12, by Roland McGrath et al.
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.5.0 20100520 (prerelease).
Compiled on a Linux 2.6.34 system on 2010-05-29.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.
There must be something missing from your library.

Resources