C++/CLI Wrapper DLL TypeLoadException too many fields - visual-c++

I have native (unmanaged) C++ DLLs that are being wrapped by a single C++/CLI DLL (linking through the .lib files). These unmanaged C++ DLLs have quite a few classes with a ton of methods and a ton of const data (e.g. strings, hex values, etc.) which are defined in included headers.
The C++/CLI wrapper DLL, however, is only a wrapping and marshalling layer for the native DLL, yet its binary size is as big as the native DLL's.
I believe this is causing me to hit the hardcoded limit which throws the exception when it is being loaded by a C# application:
System.TypeLoadException: Internal limitation: too many fields
The C# application will never use the fields defined in the headers for the native DLLs.
I was able to alleviate this issue by enabling string pooling (shaving off a few MB), but it seems like a hack.
Why is a simple wrapper of a DLL the same size as that DLL? Is there a way where I can mark the const data such that the C# application won't load them?

You are falling into a pretty common trap: the C++/CLI compiler works too well. It is capable of compiling any C++03 compatible native C++ code into IL when #pragma managed or /clr is in effect. It works well at runtime too: the IL gets just-in-time compiled by the jitter to machine code, just like regular managed programs.
That's the good news. The bad news is that this code does not execute like managed code. It doesn't get verified and it doesn't get the garbage collector love. It also doesn't run as efficiently as regularly compiled C++ code; you are missing out on the extra time that the C++ code optimizer has available to get the absolute best possible machine code.
And the one restriction that made your program bomb. Any global variables and free functions are compiled into members of the hidden <Module> class. Required because the CLR doesn't support globals. Members of a managed class get a metadata token, a number that uniquely identifies them in the metadata tables. A token is a 32-bit value with the low 16-bits used to number them. Kaboom when you created a <Module> class with more than 65535 members.
Clearly this is all quite undesirable. You'll need to pay more attention to what code gets compiled to IL and what code gets compiled to machine code. Your native C++ source code should be compiled without the /clr option in effect. Shift+click select those files and set the option. Where necessary, use #pragma un/managed to switch the compiler back and forth within one source code file.
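A minimal sketch of switching within one source file, assuming a hypothetical native helper named checksum. The #pragma managed(push, off) and #pragma managed(pop) pair is the MSVC spelling; the _MSC_VER guard is only there so the same file also builds with compilers that don't know the pragma.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#ifdef _MSC_VER
#pragma managed(push, off)   // under /clr: compile what follows to machine code
#endif

// Hypothetical native helper; name and algorithm are illustrative only.
std::uint32_t checksum(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < len; ++i) {
        sum = sum * 31u + data[i];
    }
    return sum;
}

#ifdef _MSC_VER
#pragma managed(pop)         // restore the previous setting (IL when /clr is on)
#endif

// From here on the code is compiled to IL again when /clr is in effect.
std::uint32_t checksum_of_abc() {
    const unsigned char abc[] = {'a', 'b', 'c'};
    return checksum(abc, sizeof abc);
}
```

The pragmas have to appear at file scope, between functions, not inside one.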

Why is a simple wrapper of a DLL the same size as that DLL? Is there a way where I can mark the const data such that the C# application won't load them?
This is typically because you're compiling the entire project with /CLR.
If you're very careful to only include the absolute bare minimum requirements into the .cpp files which are compiled with /CLR, and only compile the .cpp files which are managed classes with /CLR, the wrapper projects tend to be far simpler and smaller. The main issue is that any header that's used by a /CLR compiled .cpp file creates proxy types for all of the C++ types, which can explode into a huge number of fields or types in the assembly.
Using the PIMPL idiom to "hide" the native code behind an opaque pointer can also dramatically shrink the number of types exposed to the managed portion of the assembly, as this allows you to not include the main headers within the managed code.
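A rough sketch of the idea in plain C++, with hypothetical names (EngineFacade, NativeEngineImpl). The point is that the part the managed code sees only forward-declares the native type, so none of the heavy native headers are pulled into the /clr translation units.

```cpp
#include <cassert>
#include <memory>
#include <string>

// --- what the header seen by the /clr code would contain ---
class NativeEngineImpl;  // forward declaration only; no native headers needed

class EngineFacade {
public:
    EngineFacade();
    ~EngineFacade();  // defined below, where NativeEngineImpl is complete
    EngineFacade(const EngineFacade&) = delete;
    EngineFacade& operator=(const EngineFacade&) = delete;
    int add(int a, int b);
private:
    std::unique_ptr<NativeEngineImpl> impl_;  // the opaque pointer
};

// --- what the native .cpp (compiled without /clr) would contain ---
class NativeEngineImpl {
public:
    int add(int a, int b) { ++calls_; return a + b; }
private:
    int calls_ = 0;
    std::string state_ = "stands in for everything the native headers declare";
};

EngineFacade::EngineFacade() : impl_(new NativeEngineImpl) {}
EngineFacade::~EngineFacade() = default;

int EngineFacade::add(int a, int b) {
    assert(impl_ != nullptr);
    return impl_->add(a, b);  // forward to the hidden implementation
}
```

In an actual C++/CLI ref class the member would most likely be a raw pointer to the facade, created in the constructor and deleted in the destructor and finalizer, since managed classes cannot embed native objects by value.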

Related

Why does ucrtbase export _CxxThrowException?

Why do ucrtbase.dll and vcruntime140.dll overlap in some of the functions they export according to Dependency Walker?
Disclaimer: This is currently purely of academic interest to me.
I'm currently trying to understand the layout of the Microsoft Visual C++ CRT related DLL files. Find info on the UCRT and the files in general here:
Introducing the Universal CRT
CRT Library Features
In short, you have these (toplevel) DLL dependencies at runtime for a normal C++ App:
ucrtbase.dll - "compiler independent" stuff
vcruntime<ver>.dll - "compiler dependent" stuff
msvcp<ver>.dll - C++ standard library
What can be highlighted from this info is:
From the blog entry:
... split the CRT into two logical parts: The VCRuntime, which contained
the compiler support functionality required for things like process
startup and exception handling ...
and from the MSDN page:
The vcruntime library contains Visual C++ CRT implementation-specific
code, such as exception handling and debugging support, runtime checks
and type information, implementation details and certain extended
library functions. This library is specific to the version of the
compiler used.
While browsing the DLLs with Dependency Walker, I noticed that both the ucrt and the vcruntime export the function _CxxThrowException. This function is an old acquaintance if you've ever been looking at vc++ stack traces:
Builds the exception record and calls the runtime environment to start processing the exception.
I am quite surprised to find this exported from the ucrtbase.dll as - as both quotes above indicate - I'd have thought this machinery to firmly belong to the compiler specific side of things.
While writing this up, I've noticed some other overlaps: A very few of the standard C library functions (memcpy, ..., strstr, ...) are also exported from vcruntime140.dll although I'd have expected them to only live in ucrtbase.
So what is going on here and what can I learn from this?
The Universal CRT (ucrtbase.dll) contains a private copy of the VCRuntime, for use by Windows operating system components. This private copy of the VCRuntime is an internal implementation detail of the operating system and may change at any time (i.e., there is no application compatibility guarantee whatsoever).
Do not under any circumstances use these exports from the Universal CRT. (No library in the Windows SDK provides linkable symbols for these exports, so it's impossible to accidentally use them.)

16-bit obj files VC++

How do I compile my VC++ project to a 16-bit flat object file for use in my bootloader I am working on?
To my understanding, an object file is technically already "flat" and the linker turns it into the destination executable format. What I want is to be able to obtain that object file and pass it and my assembly code (in obj format) through the linker to create a flat bootloader.
The [guide][1] is not very specific on where the files are located and just says that you use cl.exe, link.exe, and ml.exe (MASM).
The guide uses MASM, but I know how to output object files with NASM. My main problem is the VC++ thing.
The last 16-bit compiler from Microsoft was VC++ 1.52c. It's ancient, and probably not available any more. Even if it was, chances are pretty good that it wouldn't compile any recent code. Just to name a few of its most obvious shortcomings, it had no support for templates, exception handling, or namespaces at all.
I believe most people working on things like that any more use Open Watcom (which isn't exactly up to date either, but still better than VC++ 1.52c).

How to make a fix in one of the shared libraries (.so) in the project on linux?

I want to make a quick fix to one of the project's .so libraries. Is it safe to just recompile the .so and replace the original? Or I have to rebuild and reinstall the whole project? Or it depends?
It depends. Shared library needs to be binary-compatible with your executable.
For example,
If you changed the behaviour of one of the library's internal functions, you probably don't need to recompile.
If you changed the size of a struct (e.g. by adding a member) that's known by the application, you will need to recompile; otherwise the application will think the struct is smaller than the library does, and the program will crash when the library tries to read an extra uninitialized member that the application didn't write to.
If you changed the type or the position of the arguments of any function visible to the application, you do need to recompile, because the library will try to read more arguments off the stack than the application has put on it (this is the case with C; in C++ argument types are part of the function signature, so the app will refuse to run rather than crash).
The rule of thumb (for production releases) is that, if you are not consciously aware that you are maintaining binary compatibility, or not sure what binary compatibility is, you should recompile.
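The struct case above can be made concrete with a small sketch. ConfigV1 and ConfigV2 are hypothetical names for the layout the application was built against and the layout a rebuilt library would use.

```cpp
#include <cassert>
#include <cstddef>

// Layout the application was compiled against.
struct ConfigV1 {
    int width;
    int height;
};

// Layout the rebuilt library uses after a member was added.
struct ConfigV2 {
    int width;
    int height;
    int depth;  // new member the old application knows nothing about
};

static_assert(sizeof(ConfigV2) > sizeof(ConfigV1),
              "adding a member grew the struct");
static_assert(offsetof(ConfigV2, depth) >= sizeof(ConfigV1),
              "the new member lies past the end of the old struct");

// Bytes the new library would read that an old caller never allocated or wrote.
std::size_t bytes_read_past_old_struct() {
    assert(sizeof(ConfigV2) >= sizeof(ConfigV1));
    return sizeof(ConfigV2) - sizeof(ConfigV1);
}
```

An application that still allocates a ConfigV1 and hands its address to the new library gives the library a pointer to an object that simply has no depth member.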
That's certainly the intent of using dynamic libraries: if something in the library needs updating, then you just update the library, and programs that use it don't need to be changed. If the signature of the function you're changing doesn't change, and it accomplishes the same thing, then this will in general be fine.
There are of course always edge cases where a program depends on some undocumented side-effect of a function, and then changing that function's implementation might change the side-effect and break the program; but c'est la vie.
If you have not changed the ABI of the shared library, you can just rebuild and replace the library.
It depends, yes.
However, I assume you have the exact same source and compiler that built everything else. If you then only change something in a .cpp file, it is fine.
Other things, e.g. changing an interface (between the shared lib and the rest of the system) in a header file, are not fine.
If you don't change your library binary interface, it's ok to recompile and redeploy only the shared library.
Good references:
How To Write Shared Libraries
The Little Manual of API Design

mixing code compiled with /MT and /MD

I have a large body of code, compiled with /MT (i.e. expecting to statically link against the CRT). I need to combine this with a static third-party library, which has been built with /MD (i.e. expecting to link the CRT dynamically).
Is it theoretically possible to link the two into one executable without recompiling either?
If I link with /nodefaultlib:msvcrt, I end up with a small number of undefined references to things like __imp__wgetenv. I'm tempted to try implementing those functions in my own code, forwarding to wgetenv, etc. Is that worth trying, or will I run straight into the next problem?
Unfortunately I'm Forbidden from taking the easy option of packing the thirdparty code into a separate DLL :-/
No. /MT and /MD are mutually exclusive.
All modules passed to a given invocation of the linker must have been compiled with the same run-time library compiler option (/MD, /MT, /LD).
Source
I found this solution in the OpenSSL sources: all .obj files of the library are compiled with the combination /MT /Zl. As the author describes it, this combination allows building a static library that can be linked into applications using either the dynamic CRT (/MD) or the static CRT (/MT).
I faced a similar situation in which I had two libraries, one built with /MT and the other with /MD, and I had to build an executable that uses functionality from both. The /MD library was third-party, so I couldn't rebuild it, and the /MT library has many dependencies, so rebuilding all of them with /MD would have been a big pain. I was also getting an error from the third-party config header file, which made it mandatory to build the executable with /MD. I was looking for the easy way mentioned in the question, packaging the third-party code as a separate DLL, but I couldn't find enough explanation of it online. Hence my two cents below.
This is how I worked around it:
I built another DLL that acts as an interface. This interface wraps all of the API calls made to the third-party DLL. The header file for this interface does not include any header from the third-party DLL; those headers are included only in interface.cpp. The interface, as you would expect, was built with /MD.
In my main.cpp file I included this interface header and made all calls to the third-party DLL through the interface.
Extra care has to be taken when passing arguments to the interface. Basic variables like int, bool, etc. can be passed by value. However, any class or structure needs to be passed by const reference to avoid heap corruption. This applies even to strings.
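One conservative way to shape such an interface, sketched with hypothetical names (ThirdPartySession, session_open and friends). It goes a little further than the description above: the header exposes only built-in types and an opaque handle, and whatever the interface allocates is also freed inside the interface, so no object is ever allocated by one CRT and released by the other.

```cpp
#include <cassert>
#include <cstring>

// --- interface.h: no third-party headers, only built-in types ---
struct ThirdPartySession;                       // opaque to the caller
ThirdPartySession* session_open(const char* name);
int  session_name_length(const ThirdPartySession* s);
void session_close(ThirdPartySession* s);       // freed by the side that allocated

// --- interface.cpp: the only file that would include third-party headers ---
struct ThirdPartySession {
    char name[64];  // stands in for whatever the third-party library needs
};

ThirdPartySession* session_open(const char* name) {
    assert(name != nullptr);
    ThirdPartySession* s = new ThirdPartySession();  // zero-initialised
    std::strncpy(s->name, name, sizeof s->name - 1); // last byte stays '\0'
    return s;
}

int session_name_length(const ThirdPartySession* s) {
    assert(s != nullptr);
    return static_cast<int>(std::strlen(s->name));
}

void session_close(ThirdPartySession* s) {
    delete s;  // same module, same heap as the new above
}
```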
Happy to share more details if it is not clear!

What are the porting issues going from VC8 (VS2005) to VC9 (VS2008)?

I have inherited a very large and complex project (actually, a 'solution' consisting of 119 'projects', most of which are DLLs) that was built and tested under VC8 (VS2005), and I have the task of porting it to VC9 (VS2008).
The porting process I used was:
Copy the VC8 .sln file and rename it to a VC9 .sln file.
Copy all of the VC8 project files, and rename them to VC9 project files.
Edit all of the VC9 project files, s/vc8/vc9/
Edit the VC9 .sln, s/vc8/vc9/
Load the VC9 .sln with VS2008, and let the IDE 'convert' all of the project files.
Fix compiler and linker errors until I got a good build.
So far, I have run into the following issues in that last step.
1) A change in the way decorated names are calculated, causing truncation of the names.
This is more than just a warning (http://msdn.microsoft.com/en-us/library/074af4b6.aspx). Libraries built with this warning will not link with other modules. Applying the solution given in MSDN was non-trivial, but doable. I addressed this problem separately in How do I increase the allowed decorated name length in VC9 (MSVC 2008)?
2) A change that does not allow the assignment of zero to an iterator. This is per the spec, and it was fairly easy to find and fix these previously-allowed coding errors. Instead of assignment of zero to an iterator, use the value end().
3) for-loop scope is now per the ANSI standard. Another easy-to-fix problem.
4) More space required for pre-compiled headers. In some cases a LOT more space was required. I ended up using /Zm999 to provide the maximum PCH space. If PCH memory usage gets bumped up again, I assume that I will have to forgo PCH altogether, and just endure the increase in what is already a very long build time.
5) A change in requirements for copy ctors and default dtors. It appears that in template classes, under certain conditions that I haven't quite figured out yet, the compiler no longer generates a default ctor or a default dtor. I suspect this is a bug in VC9, but there may be something else that I'm doing wrong. If so, I'd sure like to know what it is.
6) The GUIDs in the sln and vcproj files were not changed. This does not appear to impact the build in any way that I can detect, but it is worrisome nevertheless.
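Items 2) and 3) can be illustrated with one small function. This is only a sketch with a hypothetical helper name; it shows end() used as the "no position" value and a result carried out of the loop, since the loop variable is no longer visible after the for statement.

```cpp
#include <cassert>
#include <vector>

// Returns the index of the first negative value, or -1 if there is none.
int first_negative_index(const std::vector<int>& values) {
    // Older code sometimes assigned 0 to mean "no position"; end() is the
    // standard-conforming way to express that.
    std::vector<int>::const_iterator found = values.end();

    for (std::vector<int>::const_iterator it = values.begin();
         it != values.end(); ++it) {
        if (*it < 0) {
            found = it;
            break;
        }
    }
    // "it" is out of scope here under the standard for-loop rule, so the
    // result is read from "found" instead.
    if (found == values.end()) {
        return -1;
    }
    assert(found >= values.begin());
    return static_cast<int>(found - values.begin());
}
```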
Note that despite all of these issues, the project built, ran, and passed extensive QA testing under VC8. I have also back-ported all of the changes to the VC8 projects, where they still build and run just as happily as they did before (using VS2005/VC8). So, all of my changes required for a VC9 build at least appear to be backward-compatible, although the regression testing is still underway.
Now for the really hard problem: I have run into a difference in the startup sequence between VC8 and VC9 projects. The program uses a small-object allocator modeled after Loki, in Andrei Alexandrescu's Book Modern C++ Design. This allocator is initialized using a global variable defined in the main program module.
Under VC8, this global variable is constructed at the very beginning of the program startup, from code in a module crtexe.c. Under VC9, the first module that executes is crtdll.c, which indicates that the startup sequence has been changed. The DLLs that are starting up appear to be confusing the small-object allocator by allocating and deallocating memory before the global object can initialize the statistics, which leads to some spurious diagnostics. The operation of the program does not appear to be materially affected, but the QA folks will not allow the spurious diagnostics to get past them.
Is there some way to force the construction of a global object prior to loading DLLs?
What other porting issues am I likely to encounter?
Is there some way to force the construction of a global object prior to loading DLLs?
How about the DELAYLOAD option? So that DLLs aren't loaded until their first call?
That is a tough problem, mostly because you've inherited a design that's inherently dangerous because you're not supposed to rely on the initialization order of global variables.
It sounds like something you could try to work around by replacing the global variable with a singleton that other functions retrieve by calling a global function or method that returns a pointer to the singleton object. If the object exists at the time of the call, the function returns a pointer to it. Otherwise, it allocates a new one and returns a pointer to the newly allocated object.
The problem, of course, is that I can't think of a singleton implementation that would avoid the problem you're describing. Maybe this discussion would be useful: http://www.oneunified.net/blog/Personal/SoftwareDevelopment/CPP/Singleton.article
That's certainly an interesting problem. I don't have a solution other than perhaps to change the design so that there is no dependence on the undefined order of link/DLL startup. Have you considered linking with the older linker? (or whatever the VS.NET term is)
Because the behavior of your variable and allocator relied on some (unknown at the time) arbitrary order of startup I would probably fix that so that it is not an issue in the future. I guess you are really asking if anyone knows how to do some voodoo in VC9 to make the problem disappear. I am interested in hearing it as well.
How about this,
Make your main program a DLL too, call it main.dll, linked to all the other ones, and export the main function as say, mainEntry(). Remove the global variable.
Create a new main exe which has the global variable and its initialization, but doesn't link statically to any of the other application DLLs (except for the allocator stuff).
This new main.exe then dynamically loads the main.dll using LoadLibrary(), then uses GetProcAddress to call mainEntry().
The solution to the problem turned out to be more straightforward than I originally thought. The initialization order problem was caused by the existence of several global variables of types derived from std container types (a basic design flaw that predated my position with that company). The solution was to replace all such globals with singletons. There were about 100 of them.
Once this was done, the initialization (and destruction) order was under programmer control.
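A minimal sketch of what replacing one such global might look like, using a function-local static. The names (registry, register_value, lookup_value) are hypothetical and the original code was not shown, so this is one possible shape rather than the fix that was actually applied.

```cpp
#include <cassert>
#include <map>
#include <string>

// Before (hypothetical):
//   std::map<std::string, int> g_registry;  // constructed whenever startup gets to it

// After: the container is constructed the first time anyone asks for it.
std::map<std::string, int>& registry() {
    static std::map<std::string, int> instance;  // initialised on first call
    return instance;
}

void register_value(const std::string& key, int value) {
    assert(!key.empty());
    registry()[key] = value;
}

// Returns the stored value, or -1 if the key is unknown.
int lookup_value(const std::string& key) {
    std::map<std::string, int>::const_iterator it = registry().find(key);
    return it == registry().end() ? -1 : it->second;
}
```

Two caveats: a function-local static is still destroyed automatically at exit, so a variant that allocates on the heap and offers an explicit destroy function would be needed for full control over destruction order; and on compilers as old as VC9 the first call is not guaranteed to be thread-safe.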
