GCC/ELF - where does my symbol come from? - linux

There is an executable that is dynamically linked to a number of shared objects. How can I determine which of them a given symbol (imported into the executable) belongs to?
If there is more than one possibility, can I simulate ld and see where the symbol is taken from?

Have a look at nm(1), objdump(1) and elfdump(1) (on Linux, readelf(1) plays the role of elfdump).

As well as the ones Charlie mentioned, "ldd" might do some of what you're looking for.

If you can relink the executable, the simplest way to find out where references and definitions come from is the ld -y flag. For example:
$ cat t.c
#include <stdio.h>
int main() { printf("Hello\n"); return 0; }
$ gcc t.c -Wl,-yprintf
/lib/libc.so.6: definition of printf
If you cannot relink the executable, run ldd on it, then run 'nm -D' on each of the listed libraries in order, and grep for the symbol you are interested in.
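The ldd-then-nm procedure above can be scripted. The following is a minimal sketch, not a faithful model of the dynamic loader: it assumes GNU ldd and nm are installed, treats the order of ldd's output as the lookup order, and ignores definitions in the executable itself as well as symbol versions and weak symbols. The function names lib_paths, defines_symbol and first_definer are made up for this example.

```shell
# lib_paths: read ldd output on stdin, print one library path per line.
# Entries without a resolved path (vdso, "not found") are skipped.
lib_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3; next }
         $1 ~ /^\// { print $1 }'
}

# defines_symbol SYMBOL: read `nm -D --defined-only` output on stdin and
# succeed if SYMBOL is listed (any @VERSION suffix is ignored).
defines_symbol() {
    awk -v sym="$1" '{ n = $NF; sub(/@.*/, "", n); if (n == sym) found = 1 }
                     END { exit !found }'
}

# first_definer EXECUTABLE SYMBOL: print the first listed library that
# defines SYMBOL; prints nothing if none does.
first_definer() {
    ldd "$1" | lib_paths | while read -r lib; do
        if nm -D --defined-only "$lib" 2>/dev/null | defines_symbol "$2"; then
            printf '%s\n' "$lib"
            break
        fi
    done
}
```

With these definitions loaded, something like `first_definer ./a.out printf` would print the path of the first library in ldd's list that exports printf.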

$ LD_DEBUG=bindings my_program
That prints every symbol binding performed by the dynamic loader on the console.
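Because the binding trace is long, it helps to filter it for a single symbol. A small sketch follows; it assumes glibc's usual message format for LD_DEBUG=bindings (quoted in the comment), which may differ on other C libraries. bindings_for is a made-up helper name.

```shell
# bindings_for SYMBOL: filter LD_DEBUG=bindings output read from stdin.
# Assumes glibc's message format:
#   binding file A [0] to B [0]: normal symbol `SYMBOL' [VERSION]
bindings_for() {
    grep -F "symbol \`$1'"
}

# Typical use (LD_DEBUG output is written to stderr, hence 2>&1):
#   LD_DEBUG=bindings ./my_program 2>&1 | bindings_for printf
```

Each matching line names both the object that references the symbol and the object whose definition was chosen.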

Related

linux gcc linking, duplicate symbols?

Is there any way we can get gcc to detect a duplicate symbol in static libraries vs. the main code (or another static library)?
Here's the situation:
main.c erroneously contained a function definition, e.g. with the signature uint foohash(const char*)
foo.c also contains a function definition with the signature uint foohash(const char*)
foo.c and other source files are compiled to a static util library, which the main program links in, i.e. something like:
gcc -o main main.o util.o -L ./libs -lfooutils
So, now main.o and libs/libfooutils.a both contain a foohash function. Presumably the linker found that symbol in main.o and doesn't bother looking for it elsewhere.
Is there any way we can get gcc to detect such a situation?
Indeed, as Simon Richter stated, the --whole-archive option can be useful. Try changing your command line to:
gcc -o main main.o util.o -L ./libs -Wl,--whole-archive -lfooutils -Wl,--no-whole-archive
and you'll see a multiple definition error.
gcc calls the ld program for linking. The relevant ld options are:
--no-define-common
--traditional-format
--warn-common
See the man page for ld. These should be what you need to experiment with to get the warnings sought.
Short answer: no.
GCC does not actually do anything with libraries. It is the task of ld, the linker (called automatically by GCC) to pull in symbols from libraries, and that's really a fairly dumb tool.
The linker has lots of complex jiggery pokery for combining different types of data from different sources, and supporting different file formats, and all the evil little details of binary executables, but in the end, all it really does is look for undefined symbols and find the definitions.
What you can do is a link trace (pass -t to gcc, or -Wl,--trace, so that ld prints each input file as it processes it) to see what comes from where. Or else run nm on all the object files and libraries in your system, and write a script to detect duplicates.
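The "run nm and detect duplicates" idea can be sketched as a filter over `nm -A --defined-only` output. This is only an illustration under stated assumptions: it expects GNU nm's `file:address TYPE name` layout (with `archive:member:address` for archives), counts only global text/data/bss/read-only definitions (types T, D, B, R), and ignores weak and common symbols, which may legitimately repeat. dup_defs is a made-up name.

```shell
# dup_defs: read `nm -A --defined-only` output on stdin and print every
# global symbol that is defined in more than one object, with its definers.
dup_defs() {
    awk '
        $(NF-1) ~ /^[TDBR]$/ {
            file = $1
            sub(/:[0-9A-Fa-f]*$/, "", file)      # drop the trailing ":address"
            if (!seen[$NF, file]++) {
                count[$NF]++
                where[$NF] = where[$NF] " " file
            }
        }
        END { for (s in count) if (count[s] > 1) print s ":" where[s] }
    ' | sort
}

# Typical use for the situation in the question:
#   nm -A --defined-only main.o util.o libs/libfooutils.a | dup_defs
```

For the case described in the question, this would be expected to report foohash together with main.o and the archive member that also defines it.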

Linux ELF - Why does normal linking run faster than 'ldd -r'?

I have an exe in which none of the code has changed, but I am afraid that it links to symbols that no longer exist in its shared objects. I found two ways to test that:
Run ldd -r
Relink the exe
In some cases it seems like relinking is faster than running ldd -r. What is the reason for this?
Consider a simple case: main.o calls foo() from libfoo.so, and is linked like this:
gcc main.o -L. -lfoo
The amount of work ld has to do: discover that foo is being called, find that it is defined in libfoo.so, done. Not very much work.
Now suppose that libfoo.so itself has been linked against libbar.so, and calls 1000000 different symbols from it.
What does ldd -r have to do? It will first look in a.out for any unresolved symbols (there is only one: foo), and find a definition for it in libfoo.so (easy). Next it has to consider every undefined symbol in libfoo.so, and find a definition for each of them as well (in libbar.so). That is about 1000000 times harder. Repeat for libbar.so, and every other library linked into it.
It should not be very surprising, then, that under the above conditions ld takes significantly less time than ldd -r.
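To get a rough feel for how much work ldd -r faces for a given library, one can count that library's undefined dynamic symbols. The sketch below assumes GNU nm output in which undefined symbols carry type U (or w for weak references); count_undefined is a made-up helper, and the count is only a proxy for the loader's actual effort.

```shell
# count_undefined: read `nm -D --undefined-only` output on stdin and print
# the number of undefined (U) and weak undefined (w) symbols.
count_undefined() {
    awk '$1 == "U" || $1 == "w" { n++ } END { print n + 0 }'
}

# Typical use:
#   nm -D --undefined-only libfoo.so | count_undefined
```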

Convincing gcc to ignore system libraries in favour of locally installed libraries

I am trying to build a simple executable that uses boost_serialization and boost_iostreams.
#include <fstream>
#include <iostream>
#include <boost/archive/xml_iarchive.hpp>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/iostreams/device/file.hpp>
int main()
{
    using namespace boost::iostreams;
    filtering_ostream os;
    os.push(boost::iostreams::gzip_compressor());
    os.push(boost::iostreams::file_sink("emptyGzipBug.txt.gz"));
}
Unfortunately the system I am working with has a very outdated version of boost_serialization in /usr/lib/, and I have no way to change that.
I am fairly certain when I build the example using
g++ -o main main.cpp -lboost_serialization -lboost_iostreams
that the linker errors result because gcc uses the system version of boost_serialization rather than my locally installed version. Setting LIBRARY_PATH and LD_LIBRARY_PATH to /home/andrew/install/lib doesn't work. When I build using
g++ -o main main.cpp -L/home/andrew/install/lib -lboost_serialization -lboost_iostreams
then everything works.
My questions are:
How can I get gcc to tell me the filenames of the libraries it's using?
Is it possible to set up the environment so that I don't have to specify the absolute path to my local boost on the gcc command line?
PS After typing the below info, I thought I'd be kind and add what you need for your specific case:
g++ -Wl,-rpath,/home/andrew/install/lib -o main main.cpp -I/home/andrew/install/include -L/home/andrew/install/lib -lboost_serialization -lboost_iostreams
gcc itself doesn't care about the libraries. The linker does ;).
Even though the linker needs to find the shared libraries so it can resolve
symbols, it doesn't normally store the path of those libraries in the executable.
So, for a start, let's find out what is actually in the binary after you linked it:
$ readelf -d main | grep 'libboost'
0x0000000000000001 (NEEDED) Shared library: [libboost_serialization.so.1.54.0]
0x0000000000000001 (NEEDED) Shared library: [libboost_iostreams.so.1.54.0]
So, just the names.
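If only the bare names are wanted, the readelf output can be reduced further. A small sketch, assuming the `(NEEDED) Shared library: [name]` layout shown above; needed_libs is a made-up helper name.

```shell
# needed_libs: read `readelf -d` output on stdin and print the name inside
# the square brackets of every (NEEDED) entry, one per line.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}

# Typical use:
#   readelf -d main | needed_libs
```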
The libraries that are actually used are determined by /lib/ld-linux.so.*
at run time:
$ ldd main | grep libboost
libboost_serialization.so.1.54.0 => /usr/lib/x86_64-linux-gnu/libboost_serialization.so.1.54.0 (0x00007fd8fa920000)
libboost_iostreams.so.1.54.0 => /usr/lib/x86_64-linux-gnu/libboost_iostreams.so.1.54.0 (0x00007fd8fa700000)
The path is found by looking in /etc/ld.so.cache (which is normally
compiled by running ldconfig). You can print its contents with:
ldconfig -p | grep libboost_iostreams
libboost_iostreams.so.1.54.0 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libboost_iostreams.so.1.54.0
libboost_iostreams.so.1.49.0 (libc6,x86-64) => /usr/lib/libboost_iostreams.so.1.49.0
libboost_iostreams.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libboost_iostreams.so
but since that is only the cached result of a previous look up,
you are more interested in the output of:
$ ldconfig -v 2>/dev/null | egrep '^[^[:space:]]|libboost_iostreams'
/lib/i386-linux-gnu:
/usr/lib/i386-linux-gnu:
/usr/local/lib:
/lib/x86_64-linux-gnu:
/usr/lib/x86_64-linux-gnu:
libboost_iostreams.so.1.54.0 -> libboost_iostreams.so.1.54.0
/lib32:
/usr/lib32:
/lib:
/usr/lib:
libboost_iostreams.so.1.49.0 -> libboost_iostreams.so.1.49.0
which shows the paths that it looked in before finding a result.
Note that if you are linking a 64-bit program and it would find a 32-bit
library first (or vice versa), then that library is skipped as being
incompatible. Otherwise, the first one found is used.
The paths used to search are specified in /etc/ld.so.conf which is
read (usually at boot time, or after installing something new)
when running ldconfig as root.
However, paths given as a colon-separated list in the environment
variable LD_LIBRARY_PATH take precedence.
For example, if I do:
$ export LD_LIBRARY_PATH=/tmp
$ cp /usr/lib/libboost_iostreams.so.1.49.0 /tmp/libboost_iostreams.so.1.54.0
$ ldd main | grep libboost_iostreams
libboost_iostreams.so.1.54.0 => /tmp/libboost_iostreams.so.1.54.0 (0x00007f621add8000)
then it finds 'libboost_iostreams.so.1.54.0' in /tmp (even though it was a libboost_iostreams.so.1.49.0).
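The "first match wins" behaviour seen here can be imitated with a few lines of shell. This is a deliberately simplified sketch: it only walks a colon-separated directory list in order and checks for a file of the requested name, and it ignores everything else the real loader does (architecture checks, rpath entries, /etc/ld.so.cache). find_in_path is a made-up name.

```shell
# find_in_path NAME DIR1:DIR2:... : print the first DIR/NAME that exists.
# Returns non-zero if no directory in the list contains NAME.
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    set -- $2            # split the colon-separated list into "$@"
    IFS=$old_ifs
    for dir in "$@"; do
        [ -n "$dir" ] || continue
        if [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Typical use:
#   find_in_path libboost_iostreams.so.1.54.0 "$LD_LIBRARY_PATH:/usr/lib"
```

As in the /tmp example, the lookup goes purely by file name: whatever file carries the requested name in the earliest directory is the one that gets picked.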
Note that you CAN hardcode a path in your executable by passing -rpath to
the linker:
$ unset LD_LIBRARY_PATH
$ g++ -Wl,-rpath,/tmp -o main main.cpp -lboost_serialization -lboost_iostreams
$ ldd main | grep libboost_iostreams
libboost_iostreams.so.1.54.0 => /tmp/libboost_iostreams.so.1.54.0 (0x00007fbd8bcd8000)
which can be made visible with
$ readelf -d main | grep RPATH
0x000000000000000f (RPATH) Library rpath: [/tmp]
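Note that LD_LIBRARY_PATH takes precedence even over -rpath when the linker records the path as DT_RUNPATH (the --enable-new-dtags behaviour). If you also pass -Wl,--disable-new-dtags along with the -rpath, the path is recorded as DT_RPATH only, which is searched before LD_LIBRARY_PATH, provided that you are linking an executable and your linker supports this flag.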
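Which kind of entry a binary actually carries can be read from its dynamic section. The sketch below assumes readelf -d output in the format shown above; rpath_entries is a made-up helper name. According to ld.so(8), a DT_RPATH entry is consulted before LD_LIBRARY_PATH (as long as no DT_RUNPATH is present), while a DT_RUNPATH entry is consulted after it.

```shell
# rpath_entries: read `readelf -d` output on stdin and print
# "RPATH <dirs>" or "RUNPATH <dirs>" for each such entry found.
rpath_entries() {
    awk '/\((RPATH|RUNPATH)\)/ {
        kind = ($0 ~ /\(RUNPATH\)/) ? "RUNPATH" : "RPATH"
        i = index($0, "[")
        j = index($0, "]")
        print kind, substr($0, i + 1, j - i - 1)   # text between the brackets
    }'
}

# Typical use:
#   readelf -d main | rpath_entries
```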
You can show the search paths that gcc uses during compile(link) time with the -print-search-dirs command line option:
$ g++ -print-search-dirs | grep libraries
libraries: =/usr/lib/gcc/x86_64-linux-gnu/4.7/:/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../x86_64-linux-gnu/lib/x86_64-linux-gnu/4.7/:/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../x86_64-linux-gnu/lib/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../x86_64-linux-gnu/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/4.7/:/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../lib/:/lib/x86_64-linux-gnu/4.7/:/lib/x86_64-linux-gnu/:/lib/../lib/:/usr/lib/x86_64-linux-gnu/4.7/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../x86_64-linux-gnu/lib/:/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../:/lib/:/usr/lib/
This can be influenced by adding -L command line options. If a library can't be found in a path specified with the -L option then it looks in paths found through the environment variable GCC_EXEC_PREFIX (see the man page for that) and if that fails it uses the environment variable LIBRARY_PATH.
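The single long "libraries:" line is hard to read. A small sketch that prints one directory per line, in search order, assuming the `libraries: =dir1:dir2:...` layout shown above; library_dirs is a made-up helper name.

```shell
# library_dirs: read `gcc -print-search-dirs` output on stdin and print the
# entries of the "libraries:" line, one directory per line.
library_dirs() {
    sed -n 's/^libraries: =//p' | tr ':' '\n'
}

# Typical use:
#   g++ -print-search-dirs | library_dirs
```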
When you run g++ with the -v option, it will print the LIBRARY_PATH used.
LIBRARY_PATH=/tmp/lib g++ -v -o main main.cpp -lboost_serialization -lboost_iostreams 2>&1 | grep LIBRARY_PATH
LIBRARY_PATH=/tmp/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/4.7/:/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../lib/:/lib/x86_64-linux-gnu/:/lib/../lib/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib/:/tmp/lib/:/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../:/lib/:/usr/lib/
Finally, note that, especially for boost (but also in general), you should
use header files that match the library version. So, if the library that you
link with at run time is version xyz, you should pass an -I command line option so that g++ finds the corresponding header files; otherwise things might not link or, worse, might crash inexplicably.
-nodefaultlibs
Do not use the standard system libraries when linking. Only the
libraries you specify are passed to the linker, and options
specifying linkage of the system libraries, such as
-static-libgcc or -shared-libgcc, are ignored. The standard
startup files are used normally, unless -nostartfiles is used.
The compiler may generate calls to "memcmp", "memset", "memcpy"
and "memmove". These entries are usually resolved by entries in
libc. These entry points should be supplied through some other
mechanism when this option is specified.
Haven't used it myself but it sounds exactly like what was asked for.

How to hook without using dlsym in linux

I'm trying to hook some glibc functions, like fopen, fread etc. But in the hook function, I have to call the function of the same name in glibc. Like this:
// this is my fopen
FILE *fopen(.....)
{
fopen(....);// this is glibc fopen
}
I have found one way to do this using dlsym, but that way I have to replace all the glibc function calls with wrappers that call the glibc functions through dlsym.
I'm curious whether there is another way to do the same job without coding wrapper functions. I tried this:
fopen.c
....fopen(..)
{
myfopen(..);
}
myfopen.c
myfopen(..)
{
fopen(...);// glibc version
}
main.c
int main()
{
fopen(...);
}
$ gcc -c *.c
$ gcc -shared -o libmyfopen.so myfopen.o
$ gcc -o test main.o fopen.o libmyfopen.so
In my understanding, gcc will link from left to right as specified in the command line, so main.o will use fopen in fopen.o, fopen.o will use myfopen in libmyfopen.so, libmyfopen.so will use fopen in glibc. But when running, I got a segmentation fault; gdb shows a recursive call of fopen and myfopen. I'm a little confused. Can anyone explain why?
"In my understanding, gcc will link from left to right as specified in the command line, so main.o will use fopen in fopen.o, fopen.o will use myfopen in libmyfopen.so, libmyfopen.so will use fopen in glibc"
Your understanding is incorrect. The myfopen from libmyfopen.so will use the first definition of fopen available to it. In your setup, that definition will come from fopen.o linked into the test program, and you'll end up with infinite recursion, and a crash due to stack exhaustion.
You can observe this by running gdb ./test, running until crash, and using backtrace. You will see an unending sequence of fopen and myfopen calls.
"the symbol fopen is not bound to that in libc when compiling"
That is correct: in ELF format, the library records that it needs the symbol (fopen in this case) to be defined, but it doesn't "remember" or care which other module defines that symbol.
You can see this by running readelf -Wr libmyfopen.so | grep fopen.
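To list every symbol a library merely asks for by name, the relocation dump can be condensed. A sketch under assumptions: it expects GNU readelf's wide (-W) relocation layout, where the symbol name is the fifth column, and strips any @VERSION suffix; imported_symbols is a made-up helper name.

```shell
# imported_symbols: read `readelf -Wr` output on stdin and print the names
# of all symbols referenced by relocations, sorted and without duplicates.
# Relocations without a symbol (e.g. R_*_RELATIVE) have fewer columns
# and are skipped.
imported_symbols() {
    awk '$3 ~ /^R_/ && NF >= 5 { n = $5; sub(/@.*/, "", n); print n }' | sort -u
}

# Typical use:
#   readelf -Wr libmyfopen.so | imported_symbols
```

The output contains names only; nothing in it says which module will supply fopen, which is exactly the point made above.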
"That's different from Windows DLLs."
Yes.

Symbols from convenience library not getting exported in executable

I have a program, myprogram, which is linked with a static convenience library, call it libconvenience.a, which contains a function, func(). The function func() isn't called anywhere in myprogram; it needs to be able to be called from a plugin library, plugin.so.
The symbol func() is not getting exported dynamically in myprogram. If I run
nm myprogram | grep func
I get nothing. However, it isn't missing from libconvenience.a:
nm libconvenience/libconvenience.a | grep func
00000000 T func
I am using automake, but if I do the last linking step by hand on the command line instead, it doesn't work either:
gcc -Wl,--export-dynamic -o myprogram *.o libconvenience/libconvenience.a `pkg-config --libs somelibraries`
However, if I link the program like this, skipping the use of a convenience library and linking the object files that would have gone into libconvenience.a directly, func() shows up in myprogram's symbols as it should:
gcc -Wl,--export-dynamic -o myprogram *.o libconvenience/*.o `pkg-config --libs somelibraries`
If I add a dummy call to func() somewhere in myprogram, then func() also shows up in myprogram's symbols. But I thought that --export-dynamic was supposed to export all symbols regardless of whether they were used in the program or not!
I am using automake 1.11.1 and gcc 4.5.1 on Fedora 14. I am also using Libtool 2.2.10 to build plugin.so (but not the convenience library.)
I didn't forget to put -Wl,--export-dynamic in myprogram_LDFLAGS, nor did I forget to put the source that contains func() in libconvenience_a_SOURCES (some Googling suggests that these are common causes of this problem.)
Can somebody help me understand what is going on here?
I managed to solve it. It was this note from John Calcote's excellent Autotools book that pointed me in the right direction:
Linkers add to the binary product every object file specified explicitly on the command line, but they only extract from archives those object files that are actually referenced in the code being linked.
To counteract this behavior, one can pass the --whole-archive flag to the linker. However, applied to the whole command line, this causes all the symbols from all the system libraries to be pulled in as well, causing lots of duplicate symbol definition errors. So --whole-archive needs to come right before libconvenience.a on the linker command line, and it needs to be followed by --no-whole-archive so that the other libraries aren't treated that way. This is a bit difficult, since automake and libtool don't really guarantee keeping your flags in the same order on the command line, but this line in Makefile.am did the trick:
myprogram_LDFLAGS = -Wl,--export-dynamic \
-Wl,--whole-archive,libconvenience/libconvenience.a,--no-whole-archive
If you need func to be in plugin.so, you should try and locate it there if possible. Convenience libraries are meant to be just that -- a convenience to link to an executable or lib as an intermediate step.

Resources