Any tools to find order of .o files to be linked in a project using gcc - linux

I am porting a VC++ project to the Linux platform, using g++ as my compiler. I have resolved the compilation issues under g++ and am able to generate a .o file for every source file in the VC++ project. Now I have to link them to produce the final executable.
I can do that with
g++ file1.o file2.o -o file.out
but when I do that in my makefile and run it, a lot of ld errors appear due to dependencies.
Is there any way I can figure out the order in which to give the object files?
Are there any tools to do that, or any VC++ project files which record the order?

You say "vc++", but you are using "gcc" (usually that would be "g++"). Likely you are missing one or more libraries, which you would specify with a "-l" option (documented as part of ld as well as gcc).
The distinction is important, because each wrapper (gcc and g++) adds the corresponding runtime library to the options it passes to ld.
The order of shared libraries (the most common form with Linux) supposedly does not matter (the linker makes two passes to resolve symbols). A while back, before shared libraries were common, I wrote a program (named liborder, and mentioned here) which analyzes a collection of ".o" objects and "-l" (static libraries) to print a recommended order for the "-l" options. That was workable for small programs, but not for complex ones. For example, Oracle's runtime libraries around 20 years ago were all static, and one needed a list of 15-20 libraries in the proper order to successfully link. My program could not handle that. However, since then, shared libraries (which do not have the problem with ordering) are common enough that I have not bothered to package liborder for use by others (it's still on a to-do list with a dozen other programs).
If your program uses symbols which are not in the standard library for C/C++, then you have to determine that yourself. I suppose one could have a program that searches all of the development libraries for a given symbol, but that appears wasteful, since only a tiny fraction would be pertinent. I see 200 of these in my /usr/lib.
Rather, I make it easy for me to see what my program is missing, by presenting the symbols from nm in readable form -
For C, I use scripts (here as "externs" and "imports") to check which symbols are exported or imported from a collection of ".o" files. The scripts use the output of the nm program, which shows the given symbols.
For C++, nm has a "-C" option which shows the demangled names of symbols.

Related

How to compile ARM32 only binary (no thumb)

Is there a GCC configuration which will produce an executable only containing ARM32 code?
I know the -marm switch tells the compiler not to produce Thumb code, but it applies only to the user code of the program, while initialization routines (e.g. _start, frame_dummy, ...) still contain Thumb instructions.
I am using the Linaro cross compiler tool-chain (arm-linux-gnueabihf-) on a Linux x86-64 system.
EDIT :
While recompiling the tool-chain I found the (probable) solution myself. The initialization routines encoded as Thumb are part of glibc and can be found in the object files crt1.o, crti.o and crtbegin.o. I haven't tried recompiling it, but there may be a configuration value which forces the whole libc to be encoded as ARM32.
Is there a GCC configuration which will produce an executable only containing ARM32 code? I know the -marm switch ...
Your main problem is that this code (e.g. _start) is not produced by the compiler; it is already present, pre-compiled as Thumb code.
If you want these functions to be non-Thumb code, you'll have to "replace" the existing (Thumb) files with your own (non-Thumb) ones.
(You don't have to overwrite the existing files; you can instruct the linker to search for these files in a different directory.)
If you don't find pre-built non-Thumb files, you'll have to create them yourself (which may be a lot of work).

GNU Libraries- Which Library would have the _POSIX_OPEN_MAX symbol?

I am trying to learn more about Linux at the system-call/interface level. Starting with limits, I read in APUE that limits such as _POSIX_OPEN_MAX are symbols. After googling, I read that these symbols are in libraries. How do I find which library has the _POSIX_OPEN_MAX symbol? I did find the header files with the limits, but I would like to learn how to locate these in the compiled GNU libraries on my Linux system (using nm?). There are so many libraries that I would not know where to begin mapping out where these symbols live.
_POSIX_OPEN_MAX is a macro and is replaced at compile time. In most cases, all information about macros is discarded after preprocessing, and so there is no symbol as such.
It is possible to make gcc include information about macros by using the -gdwarf-2 and -g3 flags, but it's very unlikely these options were used when building your system libraries. So, in short, you most likely won't find it in any of them.

GNU/Debian Linux and LD

Let's say I have a massive project consisting of multiple dynamic libraries that will all be installed to /usr/lib or /usr/lib64. Now let's say one of the libraries calls into another of the compiled libraries. If I place both of the mutually dependent libraries in the same location, will the ld program allow the two libraries to call each other?
The answer is perhaps yes, but it is a very bad design to have circular references between two libraries (i.e. liba.so containing function fa, calling function fb from libb.so, calling function ga from liba.so).
You should merge the two libraries into one libbig.so. And don't worry, libraries can be quite big (some corporations have Linux libraries of several hundred megabytes of code).
The gold linker from package binutils-gold on Debian should be useful to you. It works faster than the older linker from binutils.
Yes, as long as their location is in the set of directories ld searches for libraries. You can extend this set with the LD_LIBRARY_PATH environment variable.
See this manual; it will resolve your questions.
If you mean the runtime dynamic linker /lib/ld-linux* (as opposed to /usr/bin/ld), it will look for libraries in your LD_LIBRARY_PATH, which typically includes /usr/lib and /usr/lib64.
In general, /lib/ld-* are used for .so libraries at run-time; /usr/bin/ld is used for .a libraries at compile-time.
However, if your libraries use dlopen() or similar to find one another (e.g. plug-ins), they may have other mechanisms. For example, many plug-in systems use dlopen to load every library in one or more given directories.

A question about how loader locates libraries at runtime

Only a minimum amount of work is done at compile time by the linker; it only records what library routines the program needs and the index names or numbers of the routines in the library. (source)
So it means ld.so won't check all the libraries in its database, only those recorded by the application program itself; that is, only those specified with gcc -lxxx.
This contradicts my previous knowledge that ld.so checks all the libraries in its database one by one until the symbol is found.
Which is the exact case?
I will make a stab at answering this question...
At link time the linker (not ld.so) will make sure that all the symbols in the .o files being linked together are satisfied by the libraries the program is being linked against. If any of those libraries are dynamic libraries, it will also check the libraries they depend on (no need to include them in the -l list) to make sure that all of the symbols in those libraries are satisfied. And it will do this recursively.
Only the libraries the executable directly depends on via supplied -l parameters at link time will be recorded in the executable. If the libraries themselves declared dependencies, those dependencies will not be recorded in the executable unless those libraries were also specified with -l flags at link time.
Linking happens when you run the linker. For gcc, this usually looks something like gcc a.o b.o c.o -lm -o myprogram. This generally happens at the end of the compilation process. Underneath the covers it generally runs a program called ld. But ld is not at all the same thing as ld.so (which is the runtime loader). Even though they are different entities they are named similarly because they do similar jobs, just at different times.
Loading is the step that happens when you run the program. For dynamic libraries, the loader does a lot of jobs that the linker would do if you were using static libraries.
When the program runs, ld.so (the runtime loader) actually hooks the symbols up on the executable to the definitions in the shared library. If that shared library depends on other shared libraries (a fact that's recorded in the library) it will also load those libraries and hook things up to them as well. If, after all this is done, there are still unresolved symbols, the loader will abort the program.
So, the executable says which dynamic libraries it directly depends upon. Each of those libraries say which dynamic libraries they directly depend upon, and so forth. The loader (ld.so) uses that to decide which libraries to look in for symbols. It will not go searching through random other libraries in a 'database' to find the appropriate symbols. They must be in libraries that are in the dependency chain.

Are there any tools for checking symbols in cross compiled .so files?

I've got an application that loads .so files as plugins at startup, using dlopen()
The build environment is running on x86 hardware, but the application is being cross compiled for another platform.
It would be great if I could (as part of the automated build process) do a check to make sure that there aren't any unresolved symbols in a combination of the .so files and the application, without having to actually deploy the application.
Before I write a script to test symbols using the output of nm, I'm wondering if anyone knows of a utility that already does this?
edit 1: changed the description slightly - I'm not just trying to test symbols in one .so, but rather in a combination of several .so files and the application itself, i.e. whether, after the application has loaded all of the .so files, there would still be unresolved symbols.
As has been suggested in answers (thanks Martin v. Löwis and tgamblin), nm will easily identify missing symbols in a single file but won't easily identify which of those symbols has been resolved in one of the other loaded modules.
Ideally, a cross-nm tool is part of your cross-compiler suite. For example, if you build GNU binutils for cross-compilation, a cross-nm will be provided as well (along with a cross-objdump).
Could you use a recursive version of ldd for this? Someone seems to have written a script that might help. This at least tells you that all the dependent libraries can be resolved, if they were specified correctly in the .so in the first place. You can guarantee that all the dependencies are referenced in the .so with linker options, and this plus recursive ldd would guarantee no unresolved symbols.
Linkers will often have an option to make unresolved symbols in shared libraries an error, and you could use this to avoid having to check at all. For GNU ld you can just pass --no-allow-shlib-undefined and you're guaranteed that if it makes a .so, it won't have unresolved symbols. From the GNU ld docs:
--no-undefined
Report unresolved symbol references from regular object files.
This is done even if the linker is creating a non-symbolic shared
library. The switch --[no-]allow-shlib-undefined controls the
behaviour for reporting unresolved references found in shared
libraries being linked in.
--allow-shlib-undefined
--no-allow-shlib-undefined
Allows (the default) or disallows undefined symbols in shared
libraries. This switch is similar to --no-undefined except
that it determines the behaviour when the undefined symbols are
in a shared library rather than a regular object file. It does
not affect how undefined symbols in regular object files are
handled.
The reason that --allow-shlib-undefined is the default is that the
shared library being specified at link time may not be the
same as the one that is available at load time, so the symbols might
actually be resolvable at load time. Plus there are some systems,
(eg BeOS) where undefined symbols in shared libraries is normal.
(The kernel patches them at load time to select which function is most
appropriate for the current architecture. This is used for example to
dynamically select an appropriate memset function). Apparently it is
also normal for HPPA shared libraries to have undefined symbols.
If you are going to go with a post-link check, I agree with Martin that nm is probably your best bet. I usually just grep for ' U ' in the output to check for unresolved symbols, so I think it would be a pretty simple script to write.
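Both checks can be sketched together on a deliberately broken plugin (names invented for the demo):

```shell
cat > plugin.c <<'EOF'
extern int missing_helper(void);   /* deliberately never defined */
int plugin_entry(void) { return missing_helper(); }
EOF

# Link-time check: --no-undefined turns the unresolved reference into an error.
gcc -shared -fPIC plugin.c -o plugin.so -Wl,--no-undefined 2>/dev/null \
    && echo "linked cleanly" || echo "rejected at link time"

# Post-link check: build permissively, then grep nm's output for 'U'.
# A cross build would use the toolchain's own nm (e.g. arm-linux-gnueabihf-nm).
gcc -shared -fPIC plugin.c -o plugin.so
nm plugin.so | grep ' U '
```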
The restrictions in nm turned out to mean that it wasn't possible to use it for a comprehensive symbol checker.
In particular, nm would only list exported symbols.
However, readelf will produce a comprehensive list, along with all of the library dependencies.
Using readelf it was possible to build up a script that would:
Create a list of all of the libraries used,
Build up a list of symbols in an executable (or .so)
Build up a list of unresolved symbols - if there are any unresolved symbols at this point, there would have been an error at load time.
This is then repeated until no new libraries are found.
If this is done for the executable and all of the dlopen()ed .so files it will give a good check on unresolved dependencies that would be encountered at run time.
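A single level of that procedure might look like this (run against any dynamically linked binary; a cross-compiled tree would substitute the toolchain's readelf and its own library search path for ldconfig):

```shell
BIN=/bin/ls      # stand-in for the executable or .so under test

# Step 1: the direct library dependencies (DT_NEEDED entries).
readelf -d "$BIN" | awk '/NEEDED/ { gsub(/[][]/, "", $5); print $5 }' > needed.txt

# Step 2: every symbol those libraries define (version suffixes stripped;
# the size-field check skips readelf's header lines).
: > defined.txt
while read -r lib; do
    path=$(ldconfig -p | awk -v l="$lib" '$1 == l { print $NF; exit }')
    [ -n "$path" ] && readelf -W --dyn-syms "$path" \
        | awk '$3 ~ /^[0-9]/ && $7 != "UND" && $8 != "" { sub(/@.*/, "", $8); print $8 }' >> defined.txt
done < needed.txt
sort -u -o defined.txt defined.txt

# Step 3: the binary's own undefined global symbols.
readelf -W --dyn-syms "$BIN" \
    | awk '$3 ~ /^[0-9]/ && $7 == "UND" && $5 == "GLOBAL" && $8 != "" { sub(/@.*/, "", $8); print $8 }' \
    | sort -u > undef.txt

# Anything printed here is not provided by a direct dependency; the full
# script would now recurse into the dependencies' own NEEDED lists.
comm -23 undef.txt defined.txt
```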
