How to avoid multiple copies of the code for std::vector<double>?

How to avoid multiple copies of the code for std::vector<double>? - linux

I've got one shared library -- let's call it the master. It produces one or more slave shared libraries. The slave shared libraries are interacting with the master via an interface, exchanging std::string, std::vector and others.
The compile time of the slave shared libraries must be minimized as this compile is done at the customer site dynamically.
As long as the exchanged object is not a STL container, everything works fine. e.g.
master compiles NonStlObject.cpp and NonStlObject.h and produces global text symbols (T)
client uses NonStlObject.h and creates undefined global text symbols (U)
As soon as an STL container is being exchanged, I end up with 1 + numberOfSlaves copies of the STL code -- and matching compile time -- they are weak symbols (W) in both master and slaves.
Is there any way to avoid this, other than wrapping every STL container?
PS. I don't care to get told, that the version of the compiler used for building the interacting shared libraries must be the same. Of course it must!
PPS. extern template seems to be ignored by the compiler when applied to std::vector

I end up with 1 + numberOfSlaves copies of the code
This is the least of your problems.
The much bigger problem is that it's really hard to achieve ABI compatibility in C++ across different versions of the compiler.
If you compile your code with g++-9.0, and your customer has g++-7.0 installed, chances are your code will not work at all.
You can of course ask the customer to install 9.0, but then they may not be able to build their other programs.
Or they could have g++-10.0, and then you can start seeing crashes if/when the std::string changes its ABI and is no longer compatible with your "master" library.
Is there any way to avoid this, other than wrapping every STL container?
Wrapping every STL class and passing it through a C interface definitely solves the "ABI incompatibility" problem, but I don't see how it solves the "I have N+1 copies of STL" problem (and I am not sure the latter problem needs solving in the first place).

Related

Gulp/Node: error while loading shared libraries: cannot allocate memory in static TLS block

Trying to run gulp and getting this output
$ gulp
node: error while loading shared libraries: cannot allocate memory in static TLS block
From what I have found, this seems to relate to gcc or g++, not sure how it pertains to node or gulp. Either way I can't seem to run gulp anymore. Should also mention, this just popped up today. It was running fine yesterday.
EDIT: seems like it's for all node commands. Just tried running npm -v to get the version number and it has the same output. Same with node -v
Running CentOS 6.9

The GNU toolchain supports various kinds of TLS, and one of them (the initial-exec model) involves what is essentially a fixed offset from the thread control block. At program startup, the dynamic linker computes all the offsets and makes sure that all threads have sufficient space for all the required thread local variables.
However, with dlopen, this does not work in general because it is not possible to move the thread control block around to make room for more thread-local variables. The current glibc dynamic linker has a heuristic which reserves some space for future dlopen calls, but if you load a number of shared objects, each wither their own thread-local variables, this is not enough.
The usual workaround is to use the LD_DEBUG=files environment variable (or strace) to find relevant shared objects loaded with dlopen (unfortunately, the error message you quoted does not provide this information). After that, you can use the LD_PRELOAD environment variable to tell the dynamic linker to load them early. (It is sufficient to do this for the shared object which is dlopened, its dependencies are processed automatically.) This has the side effect that the computation at program startup takes into account their TLS needs, and when the dlopen call happens later at run time, no additional TLS variables have to be allocated. However, this approach does not work for all shared objects because it affects symbol lookup and the order in which ELF constructors run.
In the general case, it may be necessary to switch some shared objects to the global-dynamic TLS model (which requires recompiling them), or use a glibc build with an increased TLS reserve. Unfortunately, the reserve cannot currently be set at run time.

I-7188ex's weird behaviour

I have quite complex I/O program (written by someone else) for controller ICPDAS i-7188ex and I am writing a library (.lib) for it that does some calculations based on data from that program.
Problem is, if I import function with only one line printf("123") and embed it inside I/O, program crashes at some point. Without imported function I/O works fine, same goes for imported function without I/O.
Maybe it is a memory issue but why should considerable memory be allocated for function which only outputs a string? Or I am completely wrong?
I am using Borland C++ 3.1. And yes, I can't use anything newer since controller takes only 80186 instruction set.

If your code is complex then sometimes your compiler can get stuck and compile it wrongly messing things up with unpredictable behavior. Happen to me many times when the code grows ... In such case usually swapping few lines of code (if you can without breaking functionality) or even adding few empty or rem lines inside code sometimes helps. Problem is to find the place where it do its thing. You can also divide your program into several files compile each separately to obj and then just link them to the final file ...
The error description remembers me of one I did fight with a long time. If you are using class/struct/template try this:
bds 2006 C hidden memory manager conflicts
may be it will help (did not test this for old turbo).
What do you mean by embed into I/O ? are you creating a sys driver file? If that is the case you need to make sure you are not messing with CPU registers. That could cause a lot of problems try to use
void some_function_or_whatever()
{
asm { pusha };
// here your code
printf("123");
asm { popa };
}
If you writing ISR handlers then you need to use interrupt keyword so compiler returns from it properly.
Without actual code and or MCVE is hard to point any specifics ...
If you can port this into BDS2006 or newer version (just for debug not really functional) then it will analyse your code more carefully and can detect a lot of hidden errors (was supprised when I ported from BCB series into BDS2006). Also there is CodeGuard option in the compiler which is ideal for finding such errors on runtime (but I fear you will not be able to run your lib without the I/O hw present in emulated DOS)

Do (statically linked) DLLs use a different heap than the main program?

I'm new to Windows programming and I've just "lost" two hours hunting a bug which everyone seems aware of: you cannot create an object on the heap in a DLL and destroy it in another DLL (or in the main program).
I'm almost sure that on Linux/Unix this is NOT the case (if it is, please say it, but I'm pretty sure I did that thousands of times without problems...).
At this point I have a couple of questions:
1) Do statically linked DLLs use a different heap than the main program?
2) Is the statically linked DLL mapped in the same process space of the main program? (I'm quite sure the answer here is a big YES otherwise it wouldn't make sense passing pointers from a function in the main program to a function in a DLL).
I'm talking about plain/regular DLL, not COM/ATL services
EDIT: By "statically linked" I mean that I don't use LoadLibrary to load the DLL but I link with the stub library

DLLs / exes will need to link to an implementation of C run time libraries.
In case of C Windows Runtime libraries, you have the option to specify, if you wish to link to the following:
Single-threaded C Run time library (Support for single threaded libraries have been discontinued now)
Multi-threaded DLL / Multi-threaded Debug DLL
Static Run time libraries.
Few More (You can check the link)
Each one of them will be referring to a different heap, so you are not allowed pass address obtained from heap of one runtime library to other.
Now, it depends on which C run time library the DLL which you are talking about has been linked to. Suppose let's say, the DLL which you are using has been linked to static C run time library and your application code (containing the main function) has linked to multi-threaded C Runtime DLL, then if you pass a pointer to memory allocated in the DLL to your main program and try to free it there or vice-versa, it can lead to undefined behaviour. So, the basic root cause are the C runtime libraries. Please choose them carefully.
Please find more info on the C run time libraries supported here & here
A quote from MSDN:
Caution Do not mix static and dynamic versions of the run-time libraries. Having more than one copy of the run-time libraries in a process can cause problems, because static data in one copy is not shared with the other copy. The linker prevents you from linking with both static and dynamic versions within one .exe file, but you can still end up with two (or more) copies of the run-time libraries. For example, a dynamic-link library linked with the static (non-DLL) versions of the run-time libraries can cause problems when used with an .exe file that was linked with the dynamic (DLL) version of the run-time libraries. (You should also avoid mixing the debug and non-debug versions of the libraries in one process.)

Let’s first understand heap allocation and stack on Windows OS wrt our applications/DLLs. Traditionally, the operating system and run-time libraries come with an implementation of the heap.
At the beginning of a process, the OS creates a default heap called Process heap. The Process heap is used for allocating blocks if no other heap is used.
Language run times also can create separate heaps within a process. (For example, C run time creates a heap of its own.)
Besides these dedicated heaps, the application program or one of the many loaded dynamic-link libraries (DLLs) may create and use separate heaps, called private heaps
These heap sits on top of the operating system's Virtual Memory Manager in all virtual memory systems.
Let’s discuss more about CRT and associated heaps:
C/C++ Run-time (CRT) allocator: Provides malloc() and free() as well as new and delete operators.
The CRT creates such an extra heap for all its allocations (the handle of this CRT heap is stored internally in the CRT library in a global variable called _crtheap) as part of its initialization.
CRT creates its own private heap, which resides on top of the Windows heap.
The Windows heap is a thin layer surrounding the Windows run-time allocator(NTDLL).
Windows run-time allocator interacts with Virtual Memory Allocator, which reserves and commits pages used by the OS.
Your DLL and exe link to multithreaded static CRT libraries. Each DLL and exe you create has a its own heap, i.e. _crtheap. The allocations and de-allocations has to happen from respective heap. That a dynamically allocated from DLL, cannot be de-allocated from executable and vice-versa.
What you can do? Compile our code in DLL and exe’s using /MD or /MDd to use the multithread-specific and DLL-specific version of the run-time library. Hence both DLL and exe are linked to the same C run time library and hence one _crtheap. Allocations are always paired with de-allocations within a single module.

If I have an application that compiles as an .exe and I want to use a library I can either statically link that library from a .lib file or dynamically linked that library from a .dll file.
Each linked module (ie. each .exe or .dll) will be linked to an implementation of the C or C++ run times. The run times themselves are a library that can be statically or dynamically linked to and come in different threading configurations.
By saying statically linked dlls are you describing a set up where an application .exe dynamically links to a library .dll and that library internally statically links to the runtime? I will assume that this is what you mean.
Also worth noting is that every module (.exe or .dll) has its own scope for statics i.e. a global static in an .exe will not be the same instance as a global static with the same name in a .dll.
In the general case therefore it cannot be assumed that lines of code running inside different modules are using the same implementation of the runtime, furthermore they will not be using the same instance of any static state.
Therefore certain rules need to be obeyed when dealing with objects or pointers that cross module boundaries. Allocations and deallocations must be occur in the same module for any given address. Otherwise the heaps will not match and behaviour will not be defined.
COM solves this this using reference counting, objects delete themselves when the reference count reaches zero. This is a common pattern used to solve the matched location problem.
Other problems can exist, for instance windows defines certain actions e.g. how allocation failures are handled on a per thread basis, not on a per module basis. This means that code running in module A on a thread setup by module B can also run into unexpected behaviour.

What functions does _WinMainCRTStartup perform?

This is part of a series of at least two closely related, but distinct questions. I hope I'm doing the right thing by asking them separately.
I'm trying to get my Visual C++ 2008 app to work without the C Runtime Library. It's a Win32 GUI app without MFC or other fancy stuff, just plain Windows API.
So I set Project Properties -> Configuration -> C/C++ -> Advanced -> Omit Default Library Names to Yes (compiler flag /Zl) and rebuilt.
Then the linker complains about an unresolved external _WinMainCRTStartup. Fair enough, I can tell the linker to use a different entry point, say MyStartup. From what I gather around the web, _WinMainCRTStartup does some initialization stuff, and I probably want MyStartup to do a subset of that.
So my question is: What functions does _WinMainCRTStartup perform, and which of these can I omit if I don't use the CRT?
If you are knowledgeable about this stuff, please have a look at my other question too. Thanks!
Aside: Why do I want to do this in the first place?
My app doesn't explicitly use any CRT functions.
I like lean and mean apps.
It'll teach me something new.

The CRT's entry point does the following (this list is not complete):
Initializes global state needed by the CRT. If this is not done, you cannot use any functions or state provided by the CRT.
Initializes some global state that is used by the compiler. Run-time checks such as the security cookie used by /GS definitely stands out here. You can call __security_init_cookie yourself, however. You may need to add other code for other run-time checks.
Calls constructors on C++ objects. If you are writing C++ code, you may need to emulate this.
Retrieves command line and start up information provided by the OS and passes it your main. By default, no parameters are passed to the entry point of the program by the OS - they are all provied by the CRT.
The CRT source code is available with Visual Studio and you can step through the CRT's entry point in a debugger and find out exactly what it is doing.

A true Win32 program written in C (not C++) doesn't need any initialization at all, so you can start your project with WinMainCRTStartup() instead of WinMain(HINSTANCE,...).
It's also possible but a bit harder to write console programs as true Win32 applications; the default name of entry point is _mainCRTStartup().
Disable all extra code generation features like stack probes, array checks etc. Debugging is still possible.
Initialization
Sometimes you need the first HINSTANCE parameter. For Win32 (except Win32s), it is fixed to (HINSTANCE)0x400000.
The nCmdShow parameter is always SW_SHOWDEFAULT.
If necessary, retrieve the command line with GetCommandLine().
Termination
When your program spawns threads, e.g. by calling GetOpenFileName(), returning from WinMainCRTStartup() with return keyword will hang your program — use ExitProcess() instead.
Caveats
You will run into considerable trouble when:
using stack frames (i.e. local variables) larger than 4 KBytes (per function)
using float-point arithmetic (e.g. float->int conversion)
using 64-bit integers on 32-bit machines (multiply, bit-shift operations)
using C++ new, delete, and static objects with non-zero-out-all-members constructors
using standard library functions like fopen(), printf() of course
Troubleshoot
There is a C standard library available on all Windows systems (since Windows 95), the MSVCRT.DLL.
To use it, import their entry points, e.g. using my msvcrt-light.lib (google for it). But there are still some caveats, especially when using compilers newer than MSVC6:
stack frames are still limited to 4 KBytes
_ftol_sse or _ftol2_sse must be routed to _ftol
_iob_func must be routed to _iob
Its initialization seems to run at load time. At least the file functions will run seemlessly.

Old question, but the answers are either incorrect or focus on one specific problem.
There are a number of C and C++ features that simply will not be available on Windows (or most operating systems, for that matter) if the programs actually started at main/WinMain.
Take this simple example:
class my_class
{
public:
my_class() { m_val = 5; }
int my_func(){ return m_val }
private:
int m_val;
}
my_class g_class;
int main(int argc, char **argv)
{
return g_class.my_func();
}
in order for this program to function as expected, the constructor for my_class must be called before main. If the program started exactly at main, it would require a compiler hack (note: GCC does this in some cases) to insert a function call at the very beginning of main. Instead, on most OSes and in most cases, a different function constructs g_class and then calls main (on Windows, this is either mainCRTStartup or WinMainCRTStartup; on most other OSes I'm used to it is a function called _start).
There's other things C++ and even C require to be done before or after main to work.
How are stdin and stdout (std::cin and std::cout) useable as soon as main starts?
How does atexit work?
The C standard requires the standard library have a POSIX-like signal API, which on Windows must be "installed" before main().
On most OSes, there is no system-provided heap; the C runtime implements its own heap (Microsoft's C runtime just wraps the Kernel32 Heap functions).
Even the arguments passed to main, argc and argv, must be gotten from the system somehow.
You might want to take a look at Matt Pietrick's (ancient) articles on implementing his own C runtime for specifics on how this works with Windows + MSVC (note: MinGW and Cygwin implement specific things differently, but actually fall back to MSVCRT for most things):
http://msdn.microsoft.com/en-us/library/bb985746.aspx

How to reduce compilation cost in GCC and make?

I am trying to build some big libraries, like Boost and OpenCV, from their source code via make and GCC under Ubuntu 8.10 on my laptop. Unfortunately the compilation of those big libraries seem to be big burden to my laptop (Acer Aspire 5000). Its fan makes higher and higher noises until out of a sudden my laptop shuts itself down without the OS gracefully turning off.
So I wonder how to reduce the compilation cost in case of make and GCC?
I wouldn't mind if the compilation will take much longer time or more space, as long as it can finish without my laptop shutting itself down.
Is building the debug version of libraries always less costly than building release version because there is no optimization?
Generally speaking, is it possible to just specify some part of a library to install instead of the full library? Can the rest be built and added into if later found needed?
Is it correct that if I restart my laptop, I can resume compilation from around where it was when my laptop shut itself down? For example, I noticed that it is true for OpenCV by looking at the progress percentage shown during its compilation does not restart from 0%. But I am not sure about Boost, since there is no obvious information for me to tell and the compilation seems to take much longer.
UPDATE:
Thanks, brianegge and Levy Chen! How to use the wrapper script for GCC and/or g++? Is it like defining some alias to GCC or g++? How to call a script to check sensors and wait until the CPU temperature drops before continuing?

I'd suggest creating a wrapper script for gcc and/or g++
#!/bin/bash
sleep 10
exec gcc "$#"
Save the above as "gccslow" or something, and then:
export CC="gccslow"
Alternatively, you can call the script gcc and put it at the front of your path. If you do that, be sure to include the full path in the script, otherwise, the script will call itself recursively.
A better implementation could call a script to check sensors and wait until the CPU temperature drops before continuing.

For your latter question: A well written Makefile will define dependencies as a directed a-cyclical graph (DAG), and it will try to satisfy those dependencies by compiling them in the order according to the DAG. Thus as a file is compiled, the dependency is satisfied and need not be compiled again.
It can, however, be tricky to write good Makefiles, and thus sometime the author will resort to a brute force approach, and recompile everything from scratch.
For your question, for such well known libraries, I will assume the Makefile is written properly, and that the build should resume from the last operation (with the caveat that it needs to rescan the DAG, and recalculate the compilation order, that should be relatively cheap).

Instead of compiling the whole thing, you can compile each target separately. You have to examine the Makefile for identifying them.
Tongue-in-cheek: What about putting the laptop into the fridge while compiling?

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string