Linux: How to find out which (sub) dependency of my library needs a specific library?

The title may seem complicated.
I made a library to be loaded within a Tcl script. Now I need to transfer it to Ubuntu 12.04.
Tclsh gives the following error:
couldn't load file "/apollo/applications/Linux-PORT/i586/lib/libapmntwraptcl.so":
libgeos-3.4.2.so:
cannot open shared object file: No such file or directory
while executing "load $::env(ACCLIB)/libapmntwraptcl[info sharedlibextension]"
Ubuntu 12.04 doesn't provide libgeos in version 3.4.2. So I need to know which (sub) dependency of my library needs the famous libgeos-3.4.2.so, so that I can rebuild it or find an alternative.
Many thanks in advance.
Edit:
Thank you for your USEFUL answers. I already tried ldd -v and ldd -r. I get 200+ dependencies when I run ldd -r. The worst part is that the result list shows libgeos-3.3.8.so => /usr/lib/libgeos-3.3.8.so (0xb3ea9000) (the version I have), but when I execute, Tclsh says
libgeos-3.4.2.so is missing.
That's why I need something able to tell me the complete dependency tree of my library.
Could anyone give me a hint (not some useless showoff)?
Thank you so much.

You've accidentally (probably through no fault of your own) wandered into “DLL Hell”; the problem is that something that libapmntwraptcl.so depends on, possibly indirectly, does not have its own dependencies satisfied. This sort of thing can be very difficult to solve, precisely because the tools that know what went wrong (in particular, the system dynamic linker) produce so little informative output by default.
What's even worse is that you apparently have multiple versions around. That's where DLL Hell reaches its worst incarnation. You need to be a detective to solve this; it's too hard to do sensibly from a distance, as many of the things you poke your fingers at are determined by what the previous steps said.
You need to identify exactly what versions you're loading, with ldd libapmntwraptcl.so (in your shell, not in Tcl). You also need to double-check what your environment variables are immediately before the offending load command, as several of them can affect the loading process. The easiest way to do that is to put parray env just before the offending load, which will produce a dump of everything in the context where things could be failing; reading the manual page for ld.so will tell you a lot more about each of the possible candidates for trouble (there are many!).
You might also need to go through the list of libraries identified by the ldd program above and check whether each of those also has all its dependencies satisfied in the way that you expect; bear in mind, too, that something ldd fails to locate might not actually make the code fail at runtime. (That would be too easy.)
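As a rough sketch (the library name is taken from the question; adjust the path if you are not in the library's directory), you can ask each dependency that ldd resolves which sonames it declares in its own NEEDED entries, which points at whichever one wants libgeos-3.4.2.so:
for dep in $(ldd libapmntwraptcl.so | awk '/=> \// {print $3}'); do
    echo "== $dep"
    objdump -p "$dep" | grep NEEDED | grep geos
done
This only looks one level down; where it is available, a tool like lddtree (from the pax-utils package) prints the whole dependency tree in one go.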
You can also try setting the LD_DEBUG environment variable to all before doing the load. That will produce quite a lot of information on standard error; maybe it will give you enough to figure out what is going wrong?
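For example (yourscript.tcl stands in for whatever script performs the load; the output goes to standard error):
LD_DEBUG=all tclsh yourscript.tcl 2> ld-debug.log     # everything the dynamic linker does
LD_DEBUG=libs tclsh yourscript.tcl 2> ld-debug.log    # just the library search and load decisions
LD_DEBUG=help tclsh yourscript.tcl                    # lists the available categories and exits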
Finally, on Linux you need to bear in mind that a particular library can have an RPATH set (which affects where its dependencies are searched for), and that there is a system library cache which can also affect which file actually gets loaded.
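Both are easy to inspect with standard tools, for instance:
readelf -d libapmntwraptcl.so | grep -E 'RPATH|RUNPATH'    # any search path baked into the library itself
ldconfig -p | grep geos                                    # what the system library cache currently knows about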
I'm really sorry the error message isn't better. All I can really say is that it's exactly as much as Tcl is told about what went wrong, and it's hardly anything.

Related

Why is "cabal build" so slow compared with "make"?

I have a package with several executables, which I initially build using cabal build. Now, if I change one file that impacts just one executable, cabal seems to take about a second or two to examine each executable to see whether it's impacted or not. On the other hand, make, given an equivalent number of executables and source files, will determine in a fraction of a second what needs to be recompiled. Why the huge difference? Is there a reason cabal can't just build its own version of a makefile and go from there?
Disclaimer: I'm not familiar enough with Haskell or make internals to give technical specifics, but some web searching does offer some insight that lines up with my proposal (trying to avoid eliciting opinions by providing references). Also, I'm assuming your makefile is calling ghc, as cabal apparently would.
Proposal: I believe there could be several key reasons, but the main one is that make is written in C, whereas cabal is written in Haskell. This would be coupled with superior dependency checking from make (although I'm not sure how to prove this without looking at the source code). Other supporting reasons, as found on the web:
cabal tries to do a lot more than simply compiling; e.g. it appears to take steps with regard to packaging (https://www.haskell.org/cabal/)
cabal is written in Haskell, although the runtime is written in C (https://en.wikipedia.org/wiki/Glasgow_Haskell_Compiler)
Again, I'm not overly familiar with make internals, but make may simply have a faster dependency-checking mechanism and thereby track these changes better. I point this out because, from the OP, it sounds like the difference is significant enough that cabal may be doing a blanket check against all dependencies. I suspect this would be the primary reason for the speed difference, if true.
At any rate, these are open source and can be downloaded from their respective sites (haskell.org/cabal/ and savannah.gnu.org/projects/make/) allowing anyone to examine specifics of the implementations.
It is also likely one could see a lot of variance in speed based upon the switches passed to the compilers in use.
Hope this helps, or at least points you in the right direction.

How to inspect Haskell bytecode

I am trying to track down a bug (a serious performance regression). Unfortunately, I wasn't able to figure out why by going back through many different versions of my code.
I suspect it could be some modifications to libraries that I've updated, not to mention that in the meantime I've updated from GHC 7.4 to 7.6 (and if anybody knows whether some laziness behaviour has changed, I would greatly appreciate it!).
I have an older executable of this code that does not have this bug, so I wonder if there are any tools that can tell me which library versions I was linking against before, e.g. by figuring it out from the symbols.
GHC creates executables, which are notoriously hard to understand... On my Linux box I can view the assembly code by typing in
objdump -d <executable filename>
but I get back over 100K lines of code from just a simple "Hello, World!" program written in Haskell.
If you happen to have the GHC .hi files, you can get some information about the compiled modules by typing in
ghc --show-iface <hi filename>
This won't give you the assembly code, but you can get some extra information that may prove useful.
As I mentioned in the comment above, on Linux you can use "ldd" to see what C-system libraries you used in the compile, but that is also probably less than useful.
You can try to use a disassembler, but those are generally written to disassemble to C, not anything higher level and certainly not Haskell. That being said, GHC compiles to C as an intermediary (at least it used to; has that changed?), so you might be able to learn something.
Personally, I often find viewing system calls in action much more interesting than viewing pure assembly. On my Linux box, I can view all system calls by running the program under strace (use Wireshark for the network-traffic equivalent):
strace <program executable>
This will also generate a lot of data, so it might only be useful if you know of some specific place where direct real-world interaction (i.e., changes to a file on the hard disk drive) goes wrong.
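If you already suspect file access, for example, you can narrow strace down to a few system calls and log to a file (the program name is a placeholder):
strace -f -e trace=open,openat,read,write -o syscalls.log <program executable>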
In all honesty, you are probably better off just debugging the problem from source, although, depending on the actual problem, some of these techniques may help you pinpoint something.
Most of these tools have Mac and Windows equivalents.
Since much has changed in the last nine years, and this is apparently still the first result a search engine gives for this question (as it was for me, again), an updated answer is in order:
First of all, yes: while Haskell does not specify a bytecode format, bytecode is just a kind of machine code for a virtual machine, so for the rest of this answer I will treat them as the same thing. GHC's “Core”, the LLVM intermediate language, or even WASM could be considered equivalent too.
Secondly, if your old binary is statically linked then, no matter what format your program is in, no symbols from its libraries will be available to check out, because that is what static linking does. That holds even with bytecode, and even with classic static #include in simple languages. So your old binary will be no good, no matter what. And given the optimisations compilers do, a classic decompiler will very likely never be able to figure out which optimised bits used to be part of which library, especially with stream fusion and similar “magic”.
Third, you can do the things you asked with a modern Haskell program, but you need to have your binaries compiled with -dynamic and -rdynamic, so that not only the C-calling-convention libraries (e.g. .so files) and the Haskell libraries, but also the runtime itself, are dynamically loaded. That way you end up with a very small binary consisting of only your actual code, dynamic linking instructions, and the exact data about which libraries and runtime were used to build it. And since the runtime is compiler-dependent, you will know the compiler too. So it would give you everything you need, but only if you compiled it right. (I recommend using such dynamic linking by default in any case, as it saves memory.)
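As a sketch (Main.hs and myprog are placeholders; the exact library names in the output depend on your GHC installation), that looks like:
ghc -dynamic -rdynamic -o myprog Main.hs
ldd myprog        # the Haskell package libraries and the GHC runtime now show up as versioned .so files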
The last factor that one might forget, is that even the exact same compiler version might behave vastly differently, depending on what IT was compiled with. (E.g. if somebody put a backdoor in the very first version of GHC, and all GHCs after that were compiled with that first GHC, and nobody ever checked, then that backdoor could still be in the code today, with no traces in any source or libraries whatsoever. … Or for a less extreme case, that version of GHC your old binary was built with might have been compiled with different architecture options, leading to it putting more optimised instructions into the binaries it compiles for unless told to cross-compile.)
Finally, of course, you can profile even compiled binaries, by profiling their system calls. This will give you clues about which part of the code acted differently and how. (E.g. if you notice that your new binary floods the system with some slow system calls where the old one just used a single fast one. A classic OpenGL example would be using fast display lists versus slow direct calls to draw triangles. Or using a different sorting algorithm, or having switched to a different kind of data structure that fits your work load badly and thrashes a lot of memory.)
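One low-effort way to compare an old and a new binary this way is strace's summary mode, which only counts calls and the time spent per system call (binary names are placeholders):
strace -c -o old-summary.txt ./old-binary
strace -c -o new-summary.txt ./new-binary
diff old-summary.txt new-summary.txt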

Loading Linux libraries at runtime

I think a major design flaw in Linux is the shared object hell when it comes to distributing programs in binary instead of source code form.
Here is my specific problem: I want to publish a Linux program in ELF binary form that should run on as many distributions as possible so my mandatory dependencies are as low as it gets: The only libraries required under any circumstances are libpthread, libX11, librt and libm (and glibc of course). I'm linking dynamically against these libraries when I build my program using gcc.
Optionally, however, my program should also support ALSA (sound interface), the Xcursor, Xfixes, and Xxf86vm extensions as well as GTK. But these should only be used if they are available on the user's system, otherwise my program should still run but with limited functionality. For example, if GTK isn't there, my program will fall back to terminal mode. Because my program should still be able to run without ALSA, Xcursor, Xfixes, etc. I cannot link dynamically against these libraries because then the program won't start at all if one of the libraries isn't there.
So I need to manually check if the libraries are present and then open them one by one using dlopen() and import the necessary function symbols using dlsym(). This, however, leads to all kinds of problems:
1) Library naming conventions:
Shared objects often aren't simply called "libXcursor.so" but have some kind of version extension like "libXcursor.so.1" or even really funny things like "libXcursor.so.0.2000". These extensions seem to differ from system to system. So which one should I choose when calling dlopen()? Using a hardcoded name here seems like a very bad idea because the names differ from system to system. So the only workaround that comes to my mind is to scan the whole library path and look for filenames starting with a "libXcursor.so" prefix and then do some custom version matching. But how do I know that they are really compatible?
2) Library search paths: Where should I look for the *.so files after all? This is also different from system to system. There are some default paths like /usr/lib and /lib but *.so files could also be in lots of other paths. So I'd have to open /etc/ld.so.conf and parse this to find out all library search paths. That's not a trivial thing to do because /etc/ld.so.conf files can also use some kind of include directive which means that I have to parse even more .conf files, do some checks against possible infinite loops caused by circular include directives etc. Is there really no easier way to find out the search paths for *.so?
So, my actual question is this: Isn't there a more convenient, less hackish way of achieving what I want to do? Is it really so complicated to create a Linux program that has some optional dependencies like ALSA, GTK, libXcursor... but should also work without them? Is there some kind of standard for doing what I want to do? Or am I doomed to do it the hackish way?
Thanks for your comments/solutions!
I think a major design flaw in Linux is the shared object hell when it comes to distributing programs in binary instead of source code form.
This isn't a design flaw as far as creators of the system are concerned; it's an advantage -- it encourages you to distribute programs in source form. Oh, you wanted to sell your software? Sorry, that's not the use case Linux is optimized for.
Library naming conventions: Shared objects often aren't simply called "libXcursor.so" but have some kind of version extension like "libXcursor.so.1" or even really funny things like "libXcursor.so.0.2000".
Yes, this is called external library versioning. Read about it here. As should be clear from that description, if you compiled your binaries using headers on a system that would normally give you libXcursor.so.1 as a runtime reference, then the only shared library you are compatible with is libXcursor.so.1, and trying to dlopen libXcursor.so.0.2000 will lead to unpredictable crashes.
Any system that provides libXcursor.so but not libXcursor.so.1 is either a broken installation, or is also incompatible with your binaries.
Library search paths: Where should I look for the *.so files after all?
You shouldn't be trying to dlopen any of these libraries using their full path. Just call dlopen("libXcursor.so.1", RTLD_LAZY | RTLD_GLOBAL);, and the runtime loader will search for the library in the system-appropriate locations.
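A minimal sketch of that approach (the symbol name and the fallback behaviour are purely illustrative, not any library's real API; compile with -ldl):
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* Ask the runtime loader for the soname we were built against;
       it searches the system-appropriate locations itself. */
    void *xcursor = dlopen("libXcursor.so.1", RTLD_LAZY | RTLD_GLOBAL);
    if (!xcursor) {
        fprintf(stderr, "Xcursor not available, running with limited functionality: %s\n",
                dlerror());
        return 0;                       /* fall back instead of failing */
    }

    /* Look up one symbol; "XcursorSomeFunction" is a made-up name. */
    void *sym = dlsym(xcursor, "XcursorSomeFunction");
    if (!sym)
        fprintf(stderr, "symbol missing: %s\n", dlerror());

    /* ... cast sym to the proper function-pointer type and use it ... */

    dlclose(xcursor);
    return 0;
}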

Finding the shared library name to use with dlload

In my open-source project Artha I use libnotify for showing passive desktop notifications to the user.
Instead of statically linking libnotify, a lookup at runtime is made for the shared object (.so) file via dlload; if it is available on the target machine, Artha exposes the notification feature in its GUI. On app start, a call to dlload is made with the filename param libnotify.so.1, and if it returns a non-null pointer, the feature is exposed.
A recurring problem with this model is that every time the library's version number is bumped, Artha's code needs to be updated; currently libnotify.so.4 is the latest to entail such an occurrence.
Is there a Linux system call (irrespective of the distro the app is running on) which can tell me whether a particular library's shared object is available at runtime? I know there is the brute-force option of enumerating the library version from 1 to, say, 10, but I find that solution ugly and inelegant.
Also, if this can be addressed via autoconf, then that solution is welcome too, i.e. at build time, based on the target machine, the generated configure.h should have the right .so name that can be passed to dlload.
P.S.: I think good distros follow the convention of creating a libnotify.so link to libnotify.so.x so that a programmer can just do dlload("libnotify.so", RTLD_LAZY) and the right version-numbered .so is loaded; unfortunately not all distros follow this, including Ubuntu.
The answer is: you don't.
dlopen() is not designed to deal with things like that, and trying to load whichever soversion you find on the system just because it happens to have the symbols you need is not a good way to do it.
Different sonames have different ABIs, and different ABIs mean that you may be calling the exact same symbol name while it expects a different set (or different sizes) of parameters, which will cause crashes or misbehaviour that are extremely difficult to debug.
You should have a read on how shared object versions work and what an ABI is.
The libfoo.so link is there for the link editor (ld) and is usually installed with the -devel packages for that reason; it might also very well not be a link but rather a text file with a linker script, often on purpose, to avoid exactly what you're trying to do.
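For illustration only (names and versions are hypothetical and vary by distro), the conventional on-disk layout looks like this, where only the last entry is a real file:
libnotify.so -> libnotify.so.4.0.0        (development symlink or linker script, shipped by the -devel package, used by ld)
libnotify.so.4 -> libnotify.so.4.0.0      (soname symlink, maintained by ldconfig, used by the dynamic loader and dlopen)
libnotify.so.4.0.0                        (the actual shared object)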

Gurus say that LD_LIBRARY_PATH is bad - what's the alternative?

I read some articles about problems with using LD_LIBRARY_PATH, even as part of a wrapper script:
http://linuxmafia.com/faq/Admin/ld-lib-path.html
http://blogs.oracle.com/ali/entry/avoiding_ld_library_path_the
In this case - what are the recommended alternatives?
Thanks.
You can try adding:
-Wl,-rpath,path/to/lib
to the linker options. This will save you the need to worry about the LD_LIBRARY_PATH environment variable, and you can decide at compile time to point to a specific library.
For a path relative to the binary, you can use $ORIGIN, e.g.
-Wl,-rpath,'$ORIGIN/../lib'
($ORIGIN may not work when statically linking to shared libraries with ld, use -Wl,--allow-shlib-undefined to fix this)
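Put together, a link line might look like this (program and library names are made up):
gcc -o myapp main.o -L./lib -lfoo -Wl,-rpath,'$ORIGIN/../lib'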
I've always set LD_LIBRARY_PATH, and I've never had a problem.
To quote your first link:
When should I set LD_LIBRARY_PATH? The short answer is never. Why? Some users seem to set this environment variable because of bad advice from other users or badly linked code that they do not know how to fix.
That is NOT what I call a definitive problem statement. In fact it brings to mind I don't like it. [YouTube, but SFW].
That second blog entry (http://blogs.oracle.com/ali/entry/avoiding_ld_library_path_the) is much more forthcoming about the nature of the problem, which appears to be, in a nutshell, library version clashes: ThisProgram requires Foo1.2, but ThatProgram requires Foo1.3, hence you can't (easily) run both programs. Note that most of these problems are negated by a simple wrapper script which sets LD_LIBRARY_PATH for just the executing shell, which is (almost always) a separate child process of the interactive shell.
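Such a wrapper can be as small as this (paths are hypothetical):
#!/bin/sh
# Set the variable for this one program only, then replace the shell with it.
LD_LIBRARY_PATH=/opt/thatprogram/lib
export LD_LIBRARY_PATH
exec /opt/thatprogram/bin/thatprogram "$@"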
Note also that the alternatives are pretty well explained in the post.
I'm just confused as to why you would post a question containing links to articles which apparently answer your question... Do you have a specific question which wasn't covered (clearly enough) in either of those articles?
The answer is in the first article you quoted.
In UNIX the location of a library can be specified with the -L dir option to the compiler.
....
As an alternative to using the -L and -R options, you can set the environment variable LD_RUN_PATH before compiling the code.
I find that the existing answers do not actually answer the question in a straightforward way:
1. LD_RUN_PATH is used by the linker (see ld) at the time you link your software. It is used only if you have no -rpath ... on the command line (-Wl,-rpath,... on the gcc command line). The path(s) defined in that variable are added to the RPATH entry in your ELF binary file. (You can see that RPATH using objdump -x binary-filename; in most cases it is not there, though. It appears in my development binaries, but once the final version gets installed, RPATH gets removed.)
2. LD_LIBRARY_PATH is used at runtime, when you want to specify a directory that the dynamic linker (see ldd) needs to search for libraries. Specifying the wrong path could lead to loading the wrong libraries. This is used in addition to the RPATH value defined in your binary (as in 1.).
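Pulling 1. and 2. together, a quick sketch (all names are made up):
LD_RUN_PATH=/opt/mylibs/lib gcc -o myapp main.o -L/opt/mylibs/lib -lfoo   # no -rpath given, so LD_RUN_PATH is used
objdump -x myapp | grep -E 'RPATH|RUNPATH'                                # verify what ended up in the ELF file
LD_LIBRARY_PATH=/opt/mylibs/lib ./myapp                                   # runtime search path, used alongside RPATH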
LD_RUN_PATH really causes no security threat unless you are a programmer and don't know how to use it. As I am using CMake to build my software, -rpath is used all the time, so I do not have to install everything just to run my software; ldd can find all the .so files automatically. (The automake environment was supposed to do that too, but it was not very good at it in comparison.)
LD_LIBRARY_PATH is a runtime variable and thus you have to be careful with it. That being said, many shared objects would be really difficult to deal with if we did not have that special feature. Is it a security threat? Probably not. If a hacker takes hold of your computer, LD_LIBRARY_PATH is accessible to that hacker anyway. What could happen is that you use the wrong path(s) in that variable and your binary may not load, but if it loads you may end up with a crashing binary, or at least a binary that does not work quite right. One concern is that over time you get new versions of the library and you are likely to forget to remove the LD_LIBRARY_PATH entry, which means you may be using an insecure version of the library.
The other possibility for a security issue is if the hacker installs a fake library with the same name as the one the binary is searching for, a library that includes all the same functions but has some of those functions replaced with sneaky code. He can get that library loaded by changing the LD_LIBRARY_PATH variable, and then it will eventually get executed. Again, if the hacker can add such a library to your system, he's already in and probably does not need to do anything like that in the first place (since he's in, he has full control of your system anyway). In reality, if the hacker can only place the library in his own account, he won't achieve much (unless your Unix box is not safe overall). And if the hacker can replace one of your /usr/lib/... libraries, he already has full access to your system, so LD_LIBRARY_PATH is not needed for that.
