Why does ucrtbase export _CxxThrowException? - visual-c++

Why do ucrtbase.dll and vcruntime140.dll overlap in some of the functions they export according to Dependency Walker?
Disclaimer: This is currently purely of academic interest to me.
I'm currently trying to understand the layout of the Microsoft Visual-C++ CRT related DLL files. Find info on the UCRT and the files in general here:
Introducing the Universal CRT
CRT Library Features
In short, you have these (top-level) DLL dependencies at runtime for a normal C++ app:
ucrtbase.dll - "compiler independent" stuff
vcruntime<ver>.dll - "compiler dependent" stuff
msvcp<ver>.dll - C++ standard library
What can be highlighted from this info is:
From the blog entry:
... split the CRT into two logical parts: The VCRuntime, which contained
the compiler support functionality required for things like process
startup and exception handling ...
and from the MSDN page:
The vcruntime library contains Visual C++ CRT implementation-specific
code, such as exception handling and debugging support, runtime checks
and type information, implementation details and certain extended
library functions. This library is specific to the version of the
compiler used.
While browsing the DLLs with Dependency Walker, I noticed that both the ucrt and the vcruntime export the function _CxxThrowException. This function is an old acquaintance if you've ever looked at VC++ stack traces:
Builds the exception record and calls the runtime environment to start processing the exception.
I am quite surprised to find this exported from ucrtbase.dll since - as both quotes above indicate - I'd have thought this machinery firmly belonged to the compiler-specific side of things.
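To make the connection concrete: every C++ throw statement compiled by MSVC is lowered into a call to this function. A minimal sketch (the signature is the one Microsoft documents; the lowered call shown in the comment is illustrative, not literal compiler output):

    // foo.cpp, compiled with: cl /EHsc /c foo.cpp
    #include <stdexcept>

    // Documented signature of the export:
    //   extern "C" void __stdcall _CxxThrowException(
    //       void* pExceptionObject, _ThrowInfo* pThrowInfo);

    void fail() {
        // The compiler lowers the throw below into (roughly):
        //   _CxxThrowException(&temporary, &throwInfoForRuntimeError);
        // where throwInfoForRuntimeError is compiler-generated metadata
        // describing the thrown type for the EH machinery.
        throw std::runtime_error("boom");
    }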
While writing this up, I've noticed some other overlaps: a few of the standard C library functions (memcpy, ..., strstr, ...) are also exported from vcruntime140.dll, although I'd have expected them to live only in ucrtbase.
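You can verify the overlap yourself from a Visual Studio developer prompt with dumpbin (paths assume the DLLs live in System32; adjust as needed):

    dumpbin /exports C:\Windows\System32\ucrtbase.dll | findstr CxxThrowException
    dumpbin /exports C:\Windows\System32\vcruntime140.dll | findstr /i "memcpy strstr"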
So what is going on here and what can I learn from this?

The Universal CRT (ucrtbase.dll) contains a private copy of the VCRuntime, for use by Windows operating system components. This private copy of the VCRuntime is an internal implementation detail of the operating system and may change at any time (i.e., there is no application compatibility guarantee whatsoever).
Do not under any circumstances use these exports from the Universal CRT. (No library in the Windows SDK provides linkable symbols for these exports, so it's impossible to accidentally use them.)

Related

Use C++ DLLs from the same VS compiled at different times/teams - ABI compatibility?

To repeat: I'm looking for ABI compatibility between libraries of the same Visual-C++ version!
We want to mix and match some internal C++ DLLs from different teams - built at different times with different project files. Because of long build times, we specifically want to avoid large monolithic builds where each team re-compiles the source code of another team's library.
When consuming C++ DLLs with C++ interfaces, it is rather clear that you can only do this if all DLLs are compiled with the same compiler / Visual Studio version.
What is not readily apparent to me is what, exactly, needs to be the same to get ABI compatibility.
Obviously debug (_DEBUG) and release (NDEBUG) cannot be mixed -- but that's also apparent from the fact that these link to different versions of the shared runtime.
Do you need the exact same compiler version, or is it sufficient that the resulting DLL links to the same shared C++ runtime -- that is, basically to the same redistributable? (I think static doesn't fly when passing full C++ objects around)
Is there a documented list of compiler (and linker) options that need to be the same for two C++ DLLs of the same vc++ version to be compatible?
For example, is the same /O switch necessary - does the optimization level affect ABI compatibility? (I'm pretty sure not.)
Or do both versions have to use the same /EH switch?
Or /volatile:ms|iso ... ?
Essentially, I'd like to come up with a set of (meta-)data to associate with a Visual-C++ DLL that describes its ABI compatibility.
If differences exist, my focus is on VS2015 only at the moment.
I've been thinking this through over the last few days, and what I did was look for existing use cases where devs have already needed to categorize their C++ builds to make sure binaries are compatible.
One such place is native packages on NuGet. So I looked at one package there, specifically the cpprestsdk:
The binaries in the downloadable package are split like this:
native\v120\windesktop\msvcstl\dyn\rt-dyn\x64\Release\
v120 - VS version
windesktop - win desktop (as opposed to WinXP or WinApp (WinRT?))
msvcstl - not sure
dyn - lib itself dynamic (as opposed to static)
rt-dyn - uses the cpp-runtime dynamically
I pulled this out from this example, because I couldn't find any other docs. I also know that the boost binaries build directory is separated in a similar way.
So, to get to a list of meta data to identify the ABI compatibility, I can preliminarily list the following:
VC version (that is, the version of the C and CPP runtime libraries used)
one point here is that e.g. vc140 should be enough nowadays - given how the CRT is linked in, all possible bugfixes to the versioned CRT components must be ABI compatible anyway, so it shouldn't matter which version a given precompiled library was built with.
pure native | managed (/CLI) | WinRT
how the CRT is consumed (statically / dynamically)
bitness / platform (Win32, x64, ARM, etc.)
Release or Debug version (i.e. which version of the CRT we link to)
plus: _ITERATOR_DEBUG_LEVEL ... if everyone goes with the defaults, fine; if a project does not, it must declare so (see the detect_mismatch sketch after this list for a way to make such a declaration machine-checkable)
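As an aside on machine-checking these: the VS2015 CRT headers already embed #pragma detect_mismatch records for "_MSC_VER", "_ITERATOR_DEBUG_LEVEL" and "RuntimeLibrary", and the linker fails with LNK2038 when two objects disagree. The same mechanism can carry custom ABI metadata; a minimal sketch (the "MyTeamCharSet" key and the MYLIB_UNICODE macro are made up for illustration):

    // abi_tag.h - include from every translation unit of the library
    #pragma once
    #ifdef MYLIB_UNICODE                   // hypothetical project-level define
    #pragma detect_mismatch("MyTeamCharSet", "unicode")
    #else
    #pragma detect_mismatch("MyTeamCharSet", "mbcs")
    #endif

Linking two binaries whose objects carry different values for the same key then fails up front at link time instead of misbehaving at runtime.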
Additionally my best guess as to the following items:
/O must not matter - we constantly mix & match binaries with different optimization settings - specifically, this even works for object files within the same binary
/volatile - since this is a code-gen thing, I have a hard time imagining how this could break an ABI
/EH - except for the option to disable all exceptions, in which case you obviously can't call anything that throws, I'm pretty confident this is safe from an ABI perspective: there are possible pitfalls here, but I think they can't really be categorized as ABI compat issues. (Maybe some complex callback chains could be said to be ABI incompatible, not sure.)
Others:
Default calling convention (/G..) : I think this would break at link time, when mangled export symbols and header declarations don't match up.
/Zc:wchar_t - will break at link time (It's actually ABI compatible, but the symbols won't match - see the mangling example below.)
Enable RTTI (/GR) - not too sure about this one - I have never worked with this disabled.
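To illustrate the /Zc:wchar_t point about symbols: the switch changes what the mangled name encodes, not how the argument is passed, so a mismatch surfaces as an unresolved external at link time. A sketch using the usual MSVC mangling scheme (verify with dumpbin /symbols or undname):

    // foo.cpp
    void f(wchar_t c) { (void)c; }
    // /Zc:wchar_t  (the default): wchar_t is a distinct type -> ?f@@YAX_W@Z
    // /Zc:wchar_t-              : wchar_t is unsigned short  -> ?f@@YAXG@Z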

C++/CLI Wrapper DLL TypeLoadException too many fields

I have native (unmanaged) C++ DLLs that are being wrapped by a single C++/CLI DLL (linking through the .lib files). These unmanaged C++ DLLs have quite a few classes with a ton of methods and a ton of const data (e.g. strings, hex values, etc.) which are defined in included headers.
The C++/CLI wrapper DLL is only a wrapping and marshalling layer for the native DLLs, yet its binary size is as big as the native DLL.
I believe this is causing me to hit the hardcoded limit which throws the exception when it is being loaded by a C# application:
System.TypeLoadException: Internal limitation: too many fields
The C# application will never use the fields defined in the headers for the native DLLs.
I was able to alleviate this issue by enabling string pooling (shaving off a few MB), but it seems like a hack.
Why is a simple wrapper of a DLL the same size as that DLL? Is there a way I can mark the const data so that the C# application won't load it?
You are falling into a pretty common trap: the C++/CLI compiler works too well. It is capable of compiling any C++03-compatible native C++ code into IL when #pragma managed or /clr is in effect. It works well at runtime too; it gets just-in-time compiled by the jitter to machine code, just like regular managed programs.
That's the good news. The bad news is that this code does not execute like managed code. It doesn't get verified and it doesn't get the garbage collector love. It also doesn't run as efficiently as regularly compiled C++ code; you are missing out on the extra time the C++ code optimizer has available to produce the absolute best possible machine code.
And then there's the one restriction that made your program bomb. Any global variables and free functions are compiled into members of the hidden <Module> class, required because the CLR doesn't support globals. Members of a managed class get a metadata token, a number that uniquely identifies them in the metadata tables. A token is a 32-bit value with the low 16 bits used to number them. Kaboom when you create a <Module> class with more than 65535 members.
Clearly this is all quite undesirable. You'll need to pay more attention to which code gets compiled to IL and which gets compiled to machine code. Your native C++ source files should be compiled without the /clr option in effect: Shift+click to select those files and set the option. Where necessary, use #pragma managed/unmanaged to switch the compiler back and forth within one source file, as sketched below.
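A minimal sketch of that switching within a single /clr source file (all names are invented for illustration):

    // wrapper.cpp - compiled with /clr
    #pragma unmanaged
    // Native code: compiled to optimized machine code and kept out of
    // the <Module> metadata table.
    static double native_sum(const double* p, int n) {
        double s = 0.0;
        for (int i = 0; i < n; ++i) s += p[i];
        return s;
    }

    #pragma managed
    public ref class Stats {
    public:
        static double Sum(array<double>^ values) {
            if (values->Length == 0) return 0.0;
            pin_ptr<double> p = &values[0];        // pin the managed array before
            return native_sum(p, values->Length);  // handing it to native code
        }
    };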
Why is a simple wrapper of a DLL the same size as that DLL? Is there a way I can mark the const data so that the C# application won't load it?
This is typically because you're compiling the entire project with /CLR.
If you're very careful to only include the absolute bare minimum requirements into the .cpp files which are compiled with /CLR, and only compile the .cpp files which are managed classes with /CLR, the wrapper projects tend to be far simpler and smaller. The main issue is that any header that's used by a /CLR compiled .cpp file creates proxy types for all of the C++ types, which can explode into a huge number of fields or types in the assembly.
Using the PIMPL idiom to "hide" the native code behind an opaque pointer can also dramatically shrink the number of types exposed to the managed portion of the assembly, as this allows you to not include the main headers within the managed code.
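A sketch of that PIMPL shape (type names are hypothetical), where the managed side only ever sees a forward declaration:

    // Engine.h - the only header the /clr .cpp needs; no native headers leak through
    class EngineImpl;                 // opaque forward declaration

    class Engine {
    public:
        Engine();
        ~Engine();
        int Compute(int input);
    private:
        EngineImpl* impl;             // definition lives entirely in Engine.cpp,
    };                                // which is compiled without /clr

Because the /clr translation unit never sees EngineImpl's members or the headers they drag in, no proxy types or fields for them get generated into the assembly.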

Combining C++/CLI, x86, x64 and strong naming

Let me get right to the point:
Main application:
C# (4.0), AnyCPU.
Library:
Wrapper for a native .dll, written in C++/CLI. Compiled in two versions: x86 and x64, both signed with the same .snk key (using this workaround)
Limitations:
In the end a single distribution package is required for x86 and x64 platforms.
Main application needs strong name due to references to other strongly named libs.
Rewriting the library using managed C# and P/Invoke is an absolute last way out.
The problem:
As long as the main application, at compile time, references the version (x86 or x64) of the library that is needed when run, this is all working fine.
Moving the same compiled output - and exchanging the library with the right platform version during installation - does not work since the signature of the library changes from that of the referenced one.
In a test application without any strong naming I can switch between them as needed.
The question:
Is there a way to enable switching between the x86 and x64 libraries within the set limitations, or is strong naming preventing any possible solution other than rewriting the lib?
Let me clarify that it is not a question about finding the correct .dll (as discussed here) but about being able to load the .dll once found.
@Damien_The_Unbeliever's comment got me thinking, and he is right in that the strong names are the same; that was not the actual issue.
I found another difference between the two versions of the library; the output name was set to nnn.dll and nnnx64.dll. Changing it so that both have the same output name magically made it all work.
Perhaps someone knows why such a setting matters; I certainly don't. (My best guess: the output file name also sets the assembly's simple name, which is part of its identity alongside version, culture and public key token, so nnn and nnnx64 are two different assemblies as far as the loader is concerned, same key or not.)

Importance of compiling single-threaded v. multi-threaded (and lib naming conventions)?

[ EDIT ] ==>
To clarify, in those environments where multiple targets are deployed to the same directory, Planet Earth has decided on a convention to append "d" or "_d" or "_debug" to the "DEBUG" version (of a library or executable). Such a convention can be considered "ubiquitous" and "understood", although (of course) not everybody does this.
SIMILARLY, to resolve ambiguity between "shared" and "static" versions of a library, a common convention is to append something to distinguish the static build from the shared one (like "myfile.lib" for the shared-import-lib-on-Windows and "myfile_s.lib" for the static-lib-on-Windows). While POSIX does not have this ambiguity based on file extension, remember that the file extension is not used on the "link line", so it is similarly useful to be able to explicitly specify the "static" or "shared" version of a library.
For the purpose of this question, both "debug/release" and "static/shared" are promoted to "ubiquitous convention to decorate the file name root".
QUESTION: Does any other deployment configuration get "promoted" to this level of "ubiquitous convention" such that it would become explicit in the file target root name?
My current guess is "no". For the answer to be "Yes", it would require that more than one configuration of a given target is intended to be "used" (and thus deployed to a common directory, which is the assumed basis for the question).
In the past, we compiled with-and-without "web plug-in" capability, which similarly required that name decoration, but we no longer build those targets (so I won't assert that as an example). Similarly, we sometimes compile with-and-without multi-byte character support, but I hate that, so I won't assert that either.
[ORIGINAL QUESTION]
We're establishing library naming conventions/policy, to be applied across languages and platforms (e.g., we support hybrid products using several languages on different platforms, including C/C++, C#, Java). A particular goal is to ensure we handle targets/resources for mobile development (which is new to us) in addition to our traditional desktop (and embedded) applications.
Of course, one option is to have different paths for targets from different build configurations. For the purpose of this question, the decision is made to have all targets co-locate to a single directory, and to "decorate" the library/resource/executable name to avoid collisions based on build configuration (e.g., "DEBUG" v. "RELEASE", "static lib" v. "shared/DLL", etc.)
Current decision is similar to others on the web, where we append tokens to avoid naming collisions:
MyName.lib (release build, import for shared/dll)
MyName_s.lib (release build, static lib)
MyName_d.lib (debug build, import for shared/DLL)
MyName_ud.lib (Unicode/wide-char, debug, import for shared/DLL)
MyName_usd.lib (Unicode/wide-char, static lib, debug)
(The above are Windows examples, but these policies similarly apply to our POSIX systems.)
These are based on:
d (release or debug)
u (ASCII or Unicode/wide-char)
s (shared/DLL or static-lib)
QUESTION: We do not have legacy applications that must be compiled single-threaded, and my understanding is that (unlike Microsoft) POSIX systems can link single- and multi-threaded targets into a single application without issue. Given today's push towards multi-core and multi-threaded, is there a need in a large enterprise to establish the following to identify "single-" versus "multi-threaded" compiled targets?
t (single-threaded or multi-threaded) *(??needed??)*
...and did we miss any other target collision, like compile with-and-without STL (on C++)?
As an aside, Microsoft has library naming conventions at:
http://msdn.microsoft.com/en-us/library/aa270400(v=vs.60).aspx and their DLL naming conventions at: http://msdn.microsoft.com/en-us/library/aa270964(v=vs.60).aspx
A similar question on SO a year ago that didn't talk about threading and didn't reference the Microsoft conventions can be found at: What is proper naming convention for MSVC dlls, static libraries and import libraries
You are using an ancient compiler. There is no need to establish such a standard in an enterprise; the vendor has already done this. Microsoft hasn't shipped a single-threaded version of the CRT for the past 13 years. Similarly, Windows has been a Unicode operating system for the past 17 years. It makes zero sense to still write Unicode-agnostic code these days.
But yes, the common convention is to append a "d" for the debug build of a library. And to give a DLL version of a library a completely different name.

Runtime Library mis-matches and VC++ - Oh, the misery!

It seems that all my adult life I've been tormented by the VC++ linker complaining or balking because various libraries do not agree on which version of the Runtime library to use. I'm never in the mood to master that dismal subject. So I just try to mess with it until it works. The error messages are never useful. Neither is the Microsoft documentation on the subject - not to me at least.
Sometimes it does not find functions - because the name-mangling is not what was expected? Sometimes it refuses to mix-and-match. Other times it just says, "LINK : warning LNK4098: defaultlib 'LIBCMTD' conflicts with use of other libs; use /NODEFAULTLIB:library" Using /NODEFAULTLIB does not work, but the warning seems to be benign. What the heck is "DEFAULTLIB" anyway? How does the linker decide? I've never seen a way to specify to the linker which runtime library to use, only how to tell the compiler which library to create function calls for.
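For what it's worth, "DEFAULTLIB" comes from the compiler, not from you: each .obj is stamped with a /DEFAULTLIB directive matching the /M flag it was compiled with (/MT, /MTd, /MD, /MDd), and the linker merges the requests from all of its inputs; LNK4098 means two inputs asked for different CRT flavors. You can inspect the stamped directives yourself (output abbreviated and will vary):

    dumpbin /directives myfile.obj

        Linker Directives
        -----------------
        /DEFAULTLIB:LIBCMTD
        /DEFAULTLIB:OLDNAMES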
There are "dependency walker" programs that can inspect object files to see what DLL's they depend on. I just ran one on a project I'm trying to build, and it's a real mess. There are system .libs and .dll's that want conflicting runtime versions. For example, COMCTL32.DLL wants MSVCRT.DLL, but I am linking with MSVCRTD.DLL. I am searching to see if there's a COMCTL32D.DLL, even as I type.
So I guess what I'm asking for is a tutorial on how to sort those things out. What do you do, and how do you do it?
Here's what I think I know. Please correct me if any of this is wrong.
1. The parameters are Debug/Release, Multi-threaded/Single-threaded, and static/DLL. Only six of the eight possible combinations are covered. There is no single-threaded DLL, either Debug or Release.
2. The settings only affect which runtime library gets linked in (and the calling convention used to link with it). You do not, for example, have to use a DLL-based runtime if you are building a DLL, nor do you have to use a Debug version of the runtime when building the Debug version of a program, although it seems to help when single-stepping past system calls.
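For reference, the compiler's /M switches map to these default CRT libraries (the single-threaded /ML variants existed only in old toolsets, which is why only six of the eight combinations exist):

    /ML   ->  LIBC.LIB      single-threaded, static, release  (removed in VS2005)
    /MLd  ->  LIBCD.LIB     single-threaded, static, debug    (removed in VS2005)
    /MT   ->  LIBCMT.LIB    multithreaded, static, release
    /MTd  ->  LIBCMTD.LIB   multithreaded, static, debug
    /MD   ->  MSVCRT.LIB    multithreaded DLL import lib, release
    /MDd  ->  MSVCRTD.LIB   multithreaded DLL import lib, debug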
Bonus question: How could anyone or any company create such a mess?
Your points (1) and (2) look correct to me. Another thing to note with (2) is that linking in the debug CRT also gives you access to things like enhanced heap checking, checked iterators, and other assorted sanity checks. You cannot redistribute the debug CRT with your application, however -- you must ship using the release build only. Not only is it required by the VC license, but you probably don't want to be shipping debug binaries anyway.
There is no such thing as COMCTL32D.DLL. DLLs that are part of Windows must load the CRT they were linked against when Windows was built -- this is included with the OS as MSVCRT.DLL. This Windows CRT is completely independent of the Visual C++ CRT that is loaded by the modules that comprise your program (MSVCRT.DLL is the one that ships with Windows; the VC CRT includes a version number in its file name, for example MSVCR80.DLL). Only the EXE and DLL files that make up your program are affected by the debug/release and multithreaded/single-threaded settings.
The best practice here IMO is to pick a setting for your CRT and standardize upon it for every binary that you ship. I'd personally use the multithreaded DLL runtime. This is because Microsoft can (and does) issue security updates and bug fixes to the CRT that can be pushed out via Windows Update.
