How can I make the dynamic loader load a library with no versioning information for a library/executable that requires versioning information?
For example, say I am trying to run /bin/bash, which requires symbol S with version X.Y.Z. libtinfo.so.6 provides symbol S but, because it was built with a musl toolchain, carries no versioning information. Currently, this gives me the following error:
/bin/bash: /usr/local/x86_64-linux-musl/lib/libtinfo.so.6: no version information available (required by /bin/bash)
Inconsistency detected by ld.so: dl-lookup.c: 112: check_match: Assertion `version->filename == NULL || ! _dl_name_match_p (version->filename, map)' failed!
I am trying to avoid the process described here where I make a custom DSO that essentially maps all symbols (i.e. I would have to write out each symbol) to the appropriate symbol in the musl library. I have seen a lot of discussion about loading older versions of symbols in a DSO, but nothing about NO symbol versions.
Does this require me to recompile every binary that references versioned symbols so that it no longer includes versioning information?
Thanks for your help!
Update
After some investigation, I found that /bin/bash has a handful of symbols that it gets from libtinfo.so.6 such as tgoto, tgetstr, tputs, tgetent, tgetflag, tgetnum, UP, BC, and PC. When the dynamic loader tries to find the correct version of these symbols (for example, tputs#NCURSES6_TINFO_5.0.19991023) in the musl-built libtinfo.so.6 it fails as there is no versioning information in that file.
I think I have the beginnings of a hack-y solution (hopefully there is a better one out there). Essentially, I make a DSO that I compile with a GNU toolchain and load with LD_PRELOAD. In this DSO, I open the musl-built libtinfo.so.6.1 with dlopen and use dlsym to get the needed symbols. These symbols are then made globally available. While there is no version information for libtinfo.so.6, there are version sections (.gnu.version and .gnu.version_r), and I am able to execute bash without any errors/warning. The DSO source is below:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
/* Pointers to the real functions in the musl-built library */
static char *(*tgoto_internal)(const char *string, int x, int y);
static char *(*tgetstr_internal)(const char * id, char **area);
static int (*tputs_internal)(const char *string, int affcnt, int (*outc)(int));
static int (*tgetent_internal)(char *bufp, const char *name);
static int (*tgetflag_internal)(const char *id);
static int (*tgetnum_internal)(const char *id);
void __attribute__ ((constructor)) init(void);
/* Resolve one symbol or abort: calling through a NULL pointer would crash later */
static void *
lookup(void *handle, const char *name)
{
    void *sym = dlsym(handle, name);
    if (!sym) {
        fprintf(stderr, "dlsym(%s): %s\n", name, dlerror());
        exit(EXIT_FAILURE);
    }
    return sym;
}
/* Library Constructor */
void
init(void)
{
    void *handle = dlopen("/usr/local/x86_64-linux-musl/lib/libtinfo.so.6.1", RTLD_LAZY);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        exit(EXIT_FAILURE);
    }
    tgoto_internal = lookup(handle, "tgoto");
    tgetstr_internal = lookup(handle, "tgetstr");
    tputs_internal = lookup(handle, "tputs");
    tgetent_internal = lookup(handle, "tgetent");
    tgetflag_internal = lookup(handle, "tgetflag");
    tgetnum_internal = lookup(handle, "tgetnum");
}
/* Wrappers that forward to the musl-built implementations */
char *
tgoto(const char *string, int x, int y)
{
    return tgoto_internal(string, x, y);
}
char *
tgetstr(const char * id, char **area)
{
    return tgetstr_internal(id, area);
}
int
tputs(const char *string, int affcnt, int (*outc)(int))
{
    return tputs_internal(string, affcnt, outc);
}
int
tgetent(char *bufp, const char *name)
{
    return tgetent_internal(bufp, name);
}
int
tgetflag(const char *id)
{
    return tgetflag_internal(id);
}
int
tgetnum(const char *id)
{
    return tgetnum_internal(id);
}
/* Objects: defined here, not forwarded from the musl library */
char * UP = 0;
char * BC = 0;
char PC = 0;
However, this solution doesn't work all the time. I still see the same warning as above when testing musl-built binaries, but now they only print the warning instead of crashing the tests.
It should also be noted that I encountered a similar versioning error before, with libreadline.so looking for versioning information in libtinfo.so. This seems to have stemmed from my musl-built libreadline.so being the wrong version (8 instead of 7): my configuration script fell back to the GNU libreadline.so, which was version 7, and that tried to pull in the musl libtinfo.so, which raised the error. Building libreadline.so.7 with the musl toolchain resolved this error perfectly.
Thanks to @LorinczyZsigmond for helping me arrive at the solution! Since they don't want to post a complete answer, I will post one to close the question.
The error:
/bin/bash: /usr/local/x86_64-linux-musl/lib/libtinfo.so.6: no version information available (required by /bin/bash)
Inconsistency detected by ld.so: dl-lookup.c: 112: check_match: Assertion `version->filename == NULL || ! _dl_name_match_p (version->filename, map)' failed!
tells us that /bin/bash is looking for libtinfo.so.6 in the musl lib directory. However, if we run ldd on /bin/bash, we see that it normally looks for DSOs in GNU's lib directory:
$ ldd /bin/bash
linux-vdso.so.1 (0x00007ffd485f7000)
libtinfo.so.6 => /lib/x86_64-linux-gnu/libtinfo.so.6 (0x00007f58ad8ba000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f58ad8b5000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f58ad6f4000)
/lib64/ld-linux-x86-64.so.2 => //lib64/ld-linux-x86-64.so.2 (0x00007f58ada22000)
When /bin/bash is run and the LD_LIBRARY_PATH environment variable points to the musl lib directory, the loader will try to resolve the libtinfo.so.6 dependency with musl's libtinfo.so.6, not GNU's. This causes a conflict since /bin/bash was linked against GNU's libtinfo.so.6 which has symbol versioning and perhaps more.
The fix, as said by @LorinczyZsigmond, is:
locally compiled shared objects should be searched first by locally compiled programs, but be hidden from the 'default' programs.
So essentially I needed to not mix the GNU and musl libraries which I was doing by heavy-handedly setting LD_LIBRARY_PATH=/usr/local/x86_64-linux-musl/lib.
Instead of using LD_LIBRARY_PATH, I used the rpath linker option (-L/usr/local/x86_64-linux-musl/lib -Wl,-rpath,/usr/local/x86_64-linux-musl/lib) to hard-code the path to my musl libraries into the executable. This allows musl-built binaries to link against the DSOs they need while also allowing GNU-built binaries to link against GNU-built DSOs (both of which are required when doing something like testing vim built from source).
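As an aside: whether the embedded path takes precedence over LD_LIBRARY_PATH depends on how it is recorded. A DT_RPATH entry in an ELF's dynamic section is searched before LD_LIBRARY_PATH, whereas a DT_RUNPATH entry (which many current linkers emit for -rpath by default) is searched after it.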
I'm writing a shared library for a FreeBSD application.
This library gets loaded via LD_PRELOAD.
The application exists in several compiled versions, so some function offsets may change and my library won't work with them.
Now I want to read the offsets when the library is loaded.
Since the offsets change, I think my only option is to look them up by function name.
The offsets are simply the offsets of functions or labels.
Now the problem: how do I do it?
Example
In the first version, I call main like this:
int(*main)(int argc, char *argv[])=(int(*)(int,char*[]))0x081F3XXX;
but in the second, the offset has changed:
int(*main)(int argc, char *argv[])=(int(*)(int,char*[]))0x08233XXX;
Programmers (me included) are lazy and don't want to compile their libs for every version. I want to create one lib that works for every version!
I simply need to get the offsets of the functions by function name; the rest is no problem.
This is how I load the library:
LD_PRELOAD="/path/to/library.so" ./executable
or
env LD_PRELOAD="/path/to/library.so" ./executable
Edit with test code
Here is my test code, based on the comments:
main.cpp:
#include <stdio.h>
void test() {
printf("Test done.\n");
}
int main(int argc, char * argv[]) {
printf("Program started\n");
test();
}
lib.cpp
#include <stdio.h>
#include <dlfcn.h>
void __attribute__ ((constructor)) my_load(void);
void my_load(void) {
printf("Library loaded\n");
printf("test - offset: 0x%x\n",dlsym(NULL,"test"));
}
test.sh
g++ main.cpp -o program
g++ -shared lib.cpp -o lib.so
env LD_PRELOAD="lib.so" ./program
-> Result:
Library loaded
test - offset: 0x0
Program started
Test done.
It doesn't seem to work :s
Edit 15:45
printf("test - offset: 0x%x\n",dlsym(dlopen("/home/test/test_proc/program",RTLD_GLOBAL),"test"));
This also does not work.. Maybe dlsym is the wrong way?
I reproduced your program on Mac OS X using Clang, and found a solution. First, the boring parts:
To make it compile cleanly I had to change your %x format specifier to %p for the pointer.
Then, on Mac OS X I had to pass RTLD_MAIN_ONLY as the first argument to dlsym(). I guess this is platform-dependent; on Linux it does seem to be NULL as you have.
Now, the meat of the fix!
You're searching with dlsym() for a symbol called test. But there is no such symbol in your application. Why? Because you're using C++, and C++ does "name mangling." You could use any number of tools to figure out the mangled name and try to load that with dlsym(), but it could change with different compilers. So instead, just inhibit name mangling by enclosing your test() function in extern "C":
extern "C" {
void test() {
printf("Test done.\n");
}
}
This fixed it for me:
$ DYLD_INSERT_LIBRARIES=lib.so ./program
Library loaded
test - offset: 0x1027d1eb0
Program started
Test done.
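For a plain C program, where name mangling does not apply, the same lookup can be sketched as below. This is only a minimal sketch, not the asker's code: `find_symbol` is a hypothetical helper, and it uses `dlopen(NULL, ...)` to obtain a handle for the main program rather than passing NULL or RTLD_MAIN_ONLY to dlsym(). It also assumes a GNU-style toolchain on Linux, where functions defined in the executable itself are usually only visible to dlsym() when the program is linked with -rdynamic, and where older glibc versions need -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: look up `name` in the running program and the
 * libraries loaded with it. Returns NULL if the symbol is not found. */
void *find_symbol(const char *name)
{
    /* dlopen(NULL, ...) returns a handle for the main program */
    void *self = dlopen(NULL, RTLD_NOW);
    if (self == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }

    dlerror();                      /* clear any stale error state */
    void *addr = dlsym(self, name);
    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym(%s): %s\n", name, err);
        addr = NULL;
    }

    dlclose(self);                  /* drops only our extra reference */
    return addr;
}
```

Inside the library constructor this could be called as `printf("test - offset: %p\n", find_symbol("test"));`.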
I would like to use ld's --build-id option in order to add build information to my binary. However, I'm not sure how to make this information available inside the program. Assume I want to write a program that writes a backtrace every time an exception occurs, and a script that parses this information. The script reads the symbol table of the program and searches for the addresses printed in the backtrace (I'm forced to use such a script because the program is statically linked and backtrace_symbols is not working). In order for the script to work correctly I need to match build version of the program with the build version of the program which created the backtrace. How can I print the build version of the program (located in the .note.gnu.build-id elf section) from the program itself?
How can I print the build version of the program (located in the .note.gnu.build-id elf section) from the program itself?
You need to read the ElfW(Ehdr) (at the beginning of the file) to find program headers in your binary (.e_phoff and .e_phnum will tell you where program headers are, and how many of them to read).
You then read program headers, until you find PT_NOTE segment of your program. That segment will tell you offset to the beginning of all the notes in your binary.
You then need to read the ElfW(Nhdr) and skip the rest of the note (total size of the note is sizeof(Nhdr) + .n_namesz + .n_descsz, properly aligned), until you find a note with .n_type == NT_GNU_BUILD_ID.
Once you find NT_GNU_BUILD_ID note, skip past its .n_namesz, and read the .n_descsz bytes to read the actual build-id.
You can verify that you are reading the right data by comparing what you read with the output of readelf -n a.out.
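The note-walking part of these steps can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: `find_build_id` is a hypothetical helper, `notes` is assumed to point at the contents of a PT_NOTE segment that has already been located and read, and name and descriptor are assumed to be padded to 4 bytes, which is the usual layout for the build-id note. The header is copied with memcpy so the buffer does not need any particular alignment.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

#ifndef NT_GNU_BUILD_ID
#define NT_GNU_BUILD_ID 3
#endif

/* Round a note field size up to the assumed 4-byte alignment */
#define NOTE_ALIGN(n) (((size_t)(n) + 3u) & ~(size_t)3u)

/* Hypothetical helper: scan `size` bytes of note data for the GNU build-id.
 * On success returns a pointer to the id bytes and stores their count in
 * *id_len; returns NULL if no build-id note is present. */
const unsigned char *find_build_id(const unsigned char *notes, size_t size,
                                   size_t *id_len)
{
    size_t off = 0;

    while (size - off >= sizeof(Elf64_Nhdr)) {
        Elf64_Nhdr nh;              /* Elf32_Nhdr has the same layout */
        memcpy(&nh, notes + off, sizeof nh);

        size_t name_off = off + sizeof nh;
        size_t desc_off = name_off + NOTE_ALIGN(nh.n_namesz);
        if (desc_off > size || nh.n_descsz > size - desc_off)
            return NULL;            /* truncated or malformed note */

        if (nh.n_type == NT_GNU_BUILD_ID && nh.n_namesz == 4 &&
            memcmp(notes + name_off, "GNU", 4) == 0) {
            *id_len = nh.n_descsz;
            return notes + desc_off;
        }

        size_t next = desc_off + NOTE_ALIGN(nh.n_descsz);
        if (next > size)
            break;
        off = next;                 /* skip to the following note */
    }
    return NULL;
}
```

Checking the note name as well as the type guards against a note from another vendor that happens to use type 3.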
P.S.
If you are going to go through the trouble to decode build-id as above, and if your executable is not stripped, it may be better for you to just decode and print symbol names instead (i.e. to replicate what backtrace_symbols does) -- it's actually easier to do than decoding ELF notes, because the symbol table contains fixed-sized entries.
Basically, this is the code I've written based on the answer given to my question. To get it to compile I had to make some changes, and I hope it will work on as many platforms as possible; however, it was tested on only one build machine. One assumption I made is that the program is built on the machine that runs it, so there is no point in checking endianness compatibility between the program and the machine.
user#:~/$ uname -s -r -m -o
Linux 3.2.0-45-generic x86_64 GNU/Linux
user#:~/$ g++ test.cpp -o test
user#:~/$ readelf -n test | grep Build
Build ID: dc5c4682e0282e2bd8bc2d3b61cfe35826aa34fc
user#:~/$ ./test
Build ID: dc5c4682e0282e2bd8bc2d3b61cfe35826aa34fc
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#if __x86_64__
# define ElfW(type) Elf64_##type
#else
# define ElfW(type) Elf32_##type
#endif
/*
detecting build id of a program from its note section
http://stackoverflow.com/questions/17637745/can-a-program-read-its-own-elf-section
http://www.scs.stanford.edu/histar/src/pkg/uclibc/utils/readelf.c
http://www.sco.com/developers/gabi/2000-07-17/ch5.pheader.html#note_section
*/
int main (int argc, char* argv[])
{
    char *thefilename = argv[0];
    FILE *thefile;
    struct stat statbuf;
    ElfW(Ehdr) *ehdr = 0;
    ElfW(Phdr) *phdr = 0;
    if (!(thefile = fopen(thefilename, "r"))) {
        perror(thefilename);
        exit(EXIT_FAILURE);
    }
    if (fstat(fileno(thefile), &statbuf) < 0) {
        perror(thefilename);
        exit(EXIT_FAILURE);
    }
    /* Map the whole file read-only; all offsets below are relative to ehdr */
    ehdr = (ElfW(Ehdr) *)mmap(0, statbuf.st_size,
                              PROT_READ, MAP_PRIVATE, fileno(thefile), 0);
    if (ehdr == MAP_FAILED) {
        perror("mmap");
        exit(EXIT_FAILURE);
    }
    phdr = (ElfW(Phdr) *)(ehdr->e_phoff + (size_t)ehdr);
    /* Check every PT_NOTE segment; the build ID is not always in the first one */
    for (int i = 0; i < ehdr->e_phnum; ++i, ++phdr)
    {
        if (phdr->p_type != PT_NOTE)
            continue;
        size_t pos = phdr->p_offset;
        size_t end = pos + phdr->p_filesz;
        while (pos + sizeof(ElfW(Nhdr)) <= end)
        {
            ElfW(Nhdr) *nhdr = (ElfW(Nhdr) *)(pos + (size_t)ehdr);
            /* Name and descriptor are each padded to a 4-byte boundary */
            size_t desc = pos + sizeof(ElfW(Nhdr)) + ((nhdr->n_namesz + 3) & ~(size_t)3);
            if (nhdr->n_type == NT_GNU_BUILD_ID)
            {
                unsigned char *build_id = (unsigned char *)ehdr + desc;
                printf(" Build ID: ");
                for (unsigned j = 0; j < nhdr->n_descsz; ++j)
                    printf("%02x", build_id[j]);
                printf("\n");
                return 0;
            }
            pos = desc + ((nhdr->n_descsz + 3) & ~(size_t)3);
        }
    }
    fprintf(stderr, "%s: no build ID note found\n", thefilename);
    return 1;
}
Yes, a program can read its own .note.gnu.build-id. The important piece is the dl_iterate_phdr function.
I've used this technique in Mesa (the OpenGL/Vulkan implementation) to read its own build-id for use with the on-disk shader cache.
I've extracted those bits into a separate project[1] for easy use by others.
[1] https://github.com/mattst88/build-id
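A rough idea of how dl_iterate_phdr can be used for this is sketched below. It is not the code from Mesa or from the linked project; `own_build_id`, `note_callback` and `struct build_id` are hypothetical names. The sketch assumes glibc behaviour, where the first object passed to the callback is the main program, and assumes 4-byte note padding.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <string.h>

#define ALIGN4(n) (((size_t)(n) + 3u) & ~(size_t)3u)

struct build_id {
    const unsigned char *data;  /* NULL if no build-id note was found */
    size_t len;
};

/* Called per loaded object; only the first one is inspected. */
static int note_callback(struct dl_phdr_info *info, size_t size, void *arg)
{
    struct build_id *out = arg;
    (void)size;

    for (size_t i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_NOTE)
            continue;

        /* Notes are already mapped in memory at load bias + p_vaddr */
        const unsigned char *seg =
            (const unsigned char *)(info->dlpi_addr + ph->p_vaddr);
        size_t seg_len = ph->p_memsz;
        size_t off = 0;

        while (seg_len - off >= sizeof(ElfW(Nhdr))) {
            ElfW(Nhdr) nh;
            memcpy(&nh, seg + off, sizeof nh);
            size_t name_off = off + sizeof nh;
            size_t desc_off = name_off + ALIGN4(nh.n_namesz);
            if (desc_off > seg_len || nh.n_descsz > seg_len - desc_off)
                break;          /* malformed or truncated note */
            if (nh.n_type == NT_GNU_BUILD_ID && nh.n_namesz == 4 &&
                memcmp(seg + name_off, "GNU", 4) == 0) {
                out->data = seg + desc_off;
                out->len = nh.n_descsz;
                return 1;
            }
            size_t next = desc_off + ALIGN4(nh.n_descsz);
            if (next > seg_len)
                break;
            off = next;
        }
    }
    return 1;   /* non-zero stops the iteration after the first object */
}

/* Returns the build-id of the running program, or {NULL, 0} if absent. */
struct build_id own_build_id(void)
{
    struct build_id id = { NULL, 0 };
    dl_iterate_phdr(note_callback, &id);
    return id;
}
```

Unlike reading argv[0] from disk, this works on the already-mapped image, so it does not depend on the executable still being reachable by its original path.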
I am trying to compile a program that controls a DAQ device. On Windows, g++ compiles and links it fine, but on Linux it doesn't. The linker (called by g++) displays:
g++ -Wall -o "acelerar-30-0" "acelerar-30-0.cpp" (in directory: /home/poly/)
/tmp/ccRLpB4q.o: In function `main':
acelerar-30-0.cpp:(.text+0x429): undefined reference to `AdxInstantAoCtrlCreate'
collect2: ld returned 1 exit status
Compilation failed.
The cpp file is this (cut):
include stdlib.h
include stdio.h
include math.h
include "compatibility.h"
include "bdaqctrl.h"
include "comunes.h"
using namespace Automation::BDaq;
define deviceDescription L"USB-4704,BID#0"
int32 channelStart = 0;
int32 channelCount = 1;
double voltaje[0];
int32 modo;
int32 ms;
int main(int argc, char* argv[])
{
if (argc!=3)
salidaerror(argv[0],1);
channelStart = atoi(argv[1]);
ms = atoi(argv[2]);
if (channelStart<0||channelStart>1||ms<10)
salidaerror(argv[0],1);
ErrorCode ret = Success;
InstantAoCtrl * instantAoCtrl = AdxInstantAoCtrlCreate();
...
I have spent several hours on this and can't find the answer. The SDK is for Debian/Ubuntu, and it has the same code for Linux and Windows.
Any hints? Thanks
EDIT: Removed some marks as the formatting was incorrect
In my (limited) experience, typical gcc behavior will require that you specify the library containing that function as an argument on the command line like so:
-lsome_library
This is required even if the library is in your library path (additional library paths can be specified with -L). Find the appropriate library file containing that function and use its filename minus extensions and leading "lib" in the argument format above.
I am working through a sample program that uses both C++ source code as well as CUDA. This is the essential content from my four source files.
matrixmul.cu (main CUDA source code):
#include <stdlib.h>
#include <cutil.h>
#include "assist.h"
#include "matrixmul.h"
int main (int argc, char ** argv)
{
...
computeGold(reference, hostM, hostN, Mh, Mw, Nw); //reference to .cpp file
...
}
matrixmul_gold.cpp (C++ source code, single function, no main method):
void computeGold(float * P, const float * M, const float * N, int Mh, int Mw, int Nw)
{
...
}
matrixmul.h (header for matrixmul_gold.cpp file)
#ifndef matrixmul_h
#define matrixmul_h
extern "C"
void computeGold(float * P, const float * M, const float * N, int Mh, int Mw, int Nw);
#endif
assist.h (helper functions)
I am trying to compile and link these files so that they, well, work. So far I can get matrixmul_gold.cpp compiled using:
g++ -c matrixmul_gold.cpp
And I can compile the CUDA source code without errors using:
nvcc -I/home/sbu/NVIDIA_GPU_Computing_SDK/C/common/inc -L/home/sbu/NVIDIA_GPU_Computing_SDK/C/lib matrixmul.cu -c -lcutil_x86_64
But I just end up with two .o files. I've tried a lot of different ways to link the two .o files, but so far it's a no-go. What's the proper approach?
UPDATE: As requested, here is the output of:
nm matrixmul_gold.o matrixmul.o | grep computeGold
nm: 'matrixmul.o': No such file
0000000000000000 T _Z11computeGoldPfPKfS1_iii
I think the 'matrixmul.o' missing error is because I am not actually getting a successful compile when running the suggested compile command:
nvcc -I/home/sbu/NVIDIA_GPU_Computing_SDK/C/common/inc -L/home/sbu/NVIDIA_GPU_Computing_SDK/C/lib -o matrixmul matrixmul.cu matrixmul_gold.o -lcutil_x86_64
UPDATE 2: I was missing an extern "C" from the beginning of matrixmul_gold.cpp. I added that and the suggested compilation command works great. Thank you!
Conventionally you would use whichever compiler you are using to compile the code containing the main subroutine to link the application. In this case you have the main in the .cu, so use nvcc to do the linking. Something like this:
$ g++ -c matrixmul_gold.cpp
$ nvcc -I/home/sbu/NVIDIA_GPU_Computing_SDK/C/common/inc \
-L/home/sbu/NVIDIA_GPU_Computing_SDK/C/lib \
-o matrixmul matrixmul.cu matrixmul_gold.o -lcutil_x86_64
This will link an executable binary called matrixmul from matrixmul.cu, matrixmul_gold.o and the cutil library (implicitly nvcc will link the CUDA runtime library and CUDA driver library as well).
I'm using Eclipse with the DDT plugin and DMD 2.06 as the compiler. When I try to use functions like dlopen, dlsym, etc., I get "unresolved reference" errors. In C with GCC I fixed them by linking with -ldl, -lsdl, etc., but the DMD2 compiler doesn't have options like that. Is there another way to link with specific libraries?
By the way, I declare the C functions the following way:
extern(C)
{
/* From <dlfcn.h>
* See http://www.opengroup.org/onlinepubs/007908799/xsh/dlsym.html
*/
const int RTLD_NOW = 2;
void *dlopen(const(char)* file, int mode);
int dlclose(void* handle);
void *dlsym(void* handle, const(char*) name);
const(char)* dlerror();
}
I would be grateful for any help.
D does have link pragmas:
pragma(lib, "dl");
which will cause DMD to emit "-L-ldl" (or the system-appropriate link flag) to the linker. If the linker is order-sensitive (as ld is), you need to specify the pragmas in the same order in which you would pass the flags manually.
Just pass -L-ldl.
Also, you don't need to redefine all of these. They are available in the core.sys.posix.dlfcn module.