MSVC: inspecting static libraries (fixing unresolved external symbols) - visual-c++

I wanted to ask what tools and techniques you use to fix linker errors in MSVC. My problem is, that I link an executable against a self built static lib and I get a lot of unresolved external symbols (LNK2019). I've tried building my libs with different calling conventions but I didn't succeed.
I want to inspect that lib file (it's a debug build) and see what functions are made available by that lib to maybe find the cause of the linker error.
I'd appreciate any suggestions how to systematically debug missing external symbols.
thanks,
Norbert

Usually these are due to compiler switches or options being different between multiple pieces. Make sure you use the same compiler switch for code generation in all of them: especially the runtime libraries need to be the same (under C/C++ in Runtime Library -- Multi Threaded (/MT) (static) or Multi Threaded DLL (/MD)). This indicates that you want to statically link agains the Microsoft runtime or not.

Related

Shared library symbol conflicts and static linking (on Linux)

I'm encountering an issue which has been elaborated in a good article Shared Library Symbol Conflicts (on Linux). The problem is that when the execution and .so have defined the same name functions, if the .so calls this function name, it would call into that one in execution rather than this one in .so itself.
Let's talk about the case in this article. I understand the DoLayer() function in layer.o has an external function dependency of DoThing() when compiling layer.o.
But when compiling the libconflict.so, shouldn't the external function dependency be resolved in-place and just replaced with the address of conflict.o/DoThing() statically?
Why does the layer.o/DoLayer() still use dynamic linking to find DoThing()? Is this a designed behavior?
Is this a designed behavior?
Yes.
At the time of introduction of shared libraries on UNIX, the goal was to pretend that they work just as if the code was in a regular (archive) library.
Suppose you have foo() defined in both libfoo and libbar, and bar() in libbar calls foo().
The design goal was that cc main.c -lfoo -lbar works the same regardless of whether libfoo and libbar are archive or a shared libraries. The only way to achieve this is to have libbar.so use dynamic linking to resolve call from bar() to foo(), despite having a local version of foo().
This design makes it impossible to create a self-contained libbar.so -- its behavior (which functions it ends up calling) depends on what other functions are linked into the process. This is also the opposite of how Windows DLLs work.
Creating self-contained DSOs was not a consideration at the time, since UNIX was effectively open-source.
You can change the rules with special linker flags, such as -Bsymbolic. But the rules get complicated very quickly, and (since that isn't the default) you may encounter bugs in the linker or the runtime loader.
Yes, this is a designed behavior. When you link a program into a binary, all the references to named external (non-static) functions are resolved to point into the symbol table for the binary. Any shared libraries that are linked against are specified as DT_NEEDED entries.
Then, when you run the binary, the dynamic linker loads each required shared library to a suitable address and resolves each symbol to an address. Sometimes this is done lazily, and sometimes it is done once at first startup. If there are multiple symbols with the same name, one of them will be chosen by the linker, and your program will likely crash since you may not end up with the right one.
Note that this is the behavior on Linux, which has all symbols as a flat namespace. Windows resolves symbols differently, using a tree topology, which has both advantages (fewer conflicts) and disadvantages (the inability to allocate memory in one library and free it in another).
The Linux behavior is very important if you want things like LD_PRELOAD to work. This allows you to use debugging tools like Electric Fence and CPU profiling tools like the Google performance tools, or replace a memory allocator at runtime. None of these things would work if symbols were preferentially resolved to their binary or shared library.
The GNU dynamic linker does support symbol versions, however, so that it's possible to load multiple versions of a shared library into the same program. Oftentimes distros like Debian will do this with libraries they expect to change frequently, like OpenSSL. If the program uses liba which uses OpenSSL 1.0 and libb which uses OpenSSL 1.1, then the program should still function in such a case since OpenSSL has versioned symbols, and each library will use the appropriate version of the relevant symbol.

mixing code compiled with /MT and /MD

I have a large body of code, compiled with /MT (i.e. expecting to statically link against the CRT). I need to combine this with a static third-party library, which has been built with /MD (i.e. expecting to link the CRT dynamically).
Is it theoretically possible to link the two into one executable without recompiling either?
If I link with /nodefaultlib:msvcrt, I end up with a small number of undefined references to things like __imp__wgetenv. I'm tempted to try implementing those functions in my own code, forwarding to wgetenv, etc. Is that worth trying, or will I run straight into the next problem?
Unfortunately I'm Forbidden from taking the easy option of packing the thirdparty code into a separate DLL :-/
No. /MT and /MD are mutually exclusive.
All modules passed to a given invocation of the linker must have been compiled with the same run-time library compiler option (/MD, /MT, /LD).
Source
I found such solution in OpenSSL sources: All obj files of the library are compiled with combination: /MT /Zl. As author described, such combination allows to build static library with ability to compile with applications either dynamic CRT (/MD) or static CRT (/MT).
I faced similar situation where in I had two libraries one was built with MT and another one with MD. I had to build an executable which uses functionalities from both the libraries. The library built as MD was third party thus I couldn't rebuilt it and library built as MT has many dependencies and to built all of them as MD is a big pain. I was getting error from the third party config header file which made it mandatory to built the executable as MD. I was looking for the easy way of packaging third party dll as a separate dll as mentioned in question. However, I couldn't find enough explanation online on this easy way. Hence my two cents below.
The following is the way I circumvent it
I built another .dll which acted as an interface. This interface basically wrapped all api calls that was made to third party dll. The header file for this interface did not include any header file from third party dll rather all those header files were included in the interface.cpp file. Interface as you expect was built as MD.
Now In my main.cpp file I included this interface header file to make all the calls to third party dll through the interface.
Extra care has to be taken in passing arguments to the interface. Basic variables like int,bool etc can be passed as value. However any class or structure needs to be passed as const reference to avoid heap corruption. This is applicable to even string.
Happy to share more details if it is not clear!

Runtime Library mis-matches and VC++ - Oh, the misery!

It seems that all my adult life I've been tormented by the VC++ linker complaining or balking because various libraries do not agree on which version of the Runtime library to use. I'm never in the mood to master that dismal subject. So I just try to mess with it until it works. The error messages are never useful. Neither is the Microsoft documentation on the subject - not to me at least.
Sometimes it does not find functions - because the name-mangling is not what was expected? Sometimes it refuses to mix-and-match. Other times it just says, "LINK : warning LNK4098: defaultlib 'LIBCMTD' conflicts with use of other libs; use /NODEFAULTLIB:library" Using /NODEFAULTLIB does not work, but the warning seems to be benign. What the heck is "DEFAULTLIB" anyway? How does the linker decide? I've never seen a way to specify to the linker which runtime library to use, only how to tell the compiler which library to create function calls for.
There are "dependency walker" programs that can inspect object files to see what DLL's they depend on. I just ran one on a project I'm trying to build, and it's a real mess. There are system .libs and .dll's that want conflicting runtime versions. For example, COMCTL32.DLL wants MSVCRT.DLL, but I am linking with MSVCRTD.DLL. I am searching to see if there's a COMCTL32D.DLL, even as I type.
So I guess what I'm asking for is a tutorial on how to sort those things out. What do you do, and how do you do it?
Here's what I think I know. Please correct me if any of this is wrong.
The parameters are Debug/Release, Multi-threaded/Single-threaded, and static/DLL. Only six of the eight possible combinations are covered. There is no single-threaded DLL, either Debug or Release.
The settings only affect which runtime library gets linked in (and the calling convention to link with it). You do not, for example, have to use a DLL-based runtime if you are building a DLL, nor do you have to use a Debug version of runtime when building the Debug version of a program, although it seems to help when single-stepping past system calls.
Bonus question: How could anyone or any company create such a mess?
Your points (1) and (2) look correct to me. Another thing to note with (2) is that linking in the debug CRT also gives you access to things like enhanced heap checking, checked iterators, and other assorted sanity checks. You cannot redistribute the debug CRT with your application, however -- you must ship using the release build only. Not only is it required by the VC license, but you probably don't want to be shipping debug binaries anyway.
There is no such thing as COMCTL32D.DLL. DLLs that are part of Windows must load the CRT that they were linked against when Windows was built -- this is included with the OS as MSVCRT.DLL. This Windows CRT is completely independent from the Visual C++ CRT that is loaded by the modules that comprise your program (MSVCRT.DLL is the one that ships with Windows. The VC CRT will include a version number, for example MSVCRT80.DLL). Only the EXE and DLL files that make up your program are affected by the debug/release multithreaded/single-threaded settings.
The best practice here IMO is to pick a setting for your CRT and standardize upon it for every binary that you ship. I'd personally use the multithreaded DLL runtime. This is because Microsoft can (and does) issue security updates and bug fixes to the CRT that can be pushed out via Windows Update.

Are there any tools for checking symbols in cross compiled .so files?

I've got an application that loads .so files as plugins at startup, using dlopen()
The build environment is running on x86 hardware, but the application is being cross compiled for another platform.
It would be great if I could (as part of the automated build process) do a check to make sure that there aren't any unresolved symbols in a combination of the .so files and the application, without having to actually deploy the application.
Before I write a script to test symbols using the output of nm, I'm wondering if anyone knows of a utility that already does this?
edit 1: changed the description slightly - I'm not just trying to test symbols in the .so, but rather in a combination of several .so's and the application itself - ie. after the application loaded all of the .so's whether there would still be unresolved symbols.
As has been suggested in answers (thanks Martin v. Löwis and tgamblin), nm will easily identify missing symbols in a single file but won't easily identify which of those symbols has been resolved in one of the other loaded modules.
Ideally, a cross-nm tool is part of your cross-compiler suite. For example, if you build GNU binutils for cross-compilation, a cross-nm will be provided as well (along with a cross-objdump).
Could you use a recursive version of ldd for this? Someone seems to have written a script that might help. This at least tell you that all the dependency libs could be resolved, if they were specified in the .so correctly in the first place. You can guarantee that all the dependencies are referenced in the .so with linker options, and this plus recursive ldd would guarantee you no unresolved symbols.
Linkers will often have an option to make unresolved symbols in shared libraries an error, and you could use this to avoid having to check at all. For GNU ld you can just pass --no-allow-shlib-undefined and you're guaranteed that if it makes a .so, it won't have unresolved symbols. From the GNU ld docs:
--no-undefined
Report unresolved symbol references from regular object files.
This is done even if the linker is creating a non-symbolic shared
library. The switch --[no-]allow-shlib-undefined controls the
behaviour for reporting unresolved references found in shared
libraries being linked in.
--allow-shlib-undefined
--no-allow-shlib-undefined
Allows (the default) or disallows undefined symbols in shared
libraries. This switch is similar to --no-undefined except
that it determines the behaviour when the undefined symbols are
in a shared library rather than a regular object file. It does
not affect how undefined symbols in regular object files are
handled.
The reason that --allow-shlib-undefined is the default is that the
shared library being specified at link time may not be the
same as the one that is available at load time, so the symbols might
actually be resolvable at load time. Plus there are some systems,
(eg BeOS) where undefined symbols in shared libraries is normal.
(The kernel patches them at load time to select which function is most
appropriate for the current architecture. This is used for example to
dynamically select an appropriate memset function). Apparently it is
also normal for HPPA shared libraries to have undefined symbols.
If you are going to go with a post-link check, I agree with Martin that nm is probably your best bet. I usually just grep for ' U ' in the output to check for unresolved symbols, so I think it would be a pretty simple script to write.
The restrictions in nm turned out to mean that it wasn't possible to use for a comprehensive symbol checker.
In particular, nm would only list exported symbols.
However, readelf will produce a comprehensive list, along with all of the library dependencies.
Using readelf it was possible to build up a script that would:
Create a list of all of the libraries used,
Build up a list of symbols in an executable (or .so)
Build up a list of unresolved symbols - if there are any unresolved symbols at this point, there would have been an error at load time.
This is then repeated until no new libraries are found.
If this is done for the executable and all of the dlopen()ed .so files it will give a good check on unresolved dependencies that would be encountered at run time.

Loading multiple shared libraries with different versions

I have an executable on Linux that loads libfoo.so.1 (that's a SONAME) as one of its dependencies (via another shared library). It also links to another system library, which, in turn, links to a system version, libfoo.so.2. As a result, both libfoo.so.1 and libfoo.so.2 are loaded during execution, and code that was supposed to call functions from library with version 1 ends up calling (binary-incompatible) functions from a newer system library with version 2, because some symbols stay the same. The result is usually stack smashing and a subsequent segfault.
Now, the library which links against the older version is a closed-source third-party library, and I can't control what version of libfoo it compiles against. Assuming that, the only other option left is rebuilding a bunch of system libraries that currently link with libfoo.so.2 to link with libfoo.so.1.
Is there any way to avoid replacing system libraries wiith local copies that link to older libfoo? Can I load both libraries and have the code calling correct version of symbols? So I need some special symbol-level versioning?
You may be able to do some version script tricks:
http://sunsite.ualberta.ca/Documentation/Gnu/binutils-2.9.1/html_node/ld_26.html
This may require that you write a wrapper around your lib that pulls in libfoo.so.1 that exports some symbols explicitly and masks all others as local. For example:
MYSYMS {
global:
foo1;
foo2;
local:
*;
};
and use this when you link that wrapper like:
gcc -shared -Wl,--version-script,mysyms.map -o mylib wrapper.o -lfoo -L/path/to/foo.so.1
This should make libfoo.so.1's symbols local to the wrapper and not available to the main exe.
I can only come up with a work-around. Which would be to statically link a version of the "system library" that you are using. For your static build, you could make it link against the same old version as the third-party library. Given that it does not rely on the newer version...
Perhaps it is also possible to avoid these problems with not linking to the third-party library the ordinary way. Instead, your program could load it at execution time. Perhaps then it could be shadowed against the rest. But I don't know much about that.

Resources