Occasionally you meet bugs that are reproducible only in release builds and/or only on some machines. A common (but by no means the only) reason is uninitialized variables, which take on effectively random values. E.g., an uninitialized BOOL can be TRUE most of the time, on most machines, but occasionally come up FALSE.
What I wish I had is a systematic way of flushing out such bugs by modifying the behaviour of the CRT memory initialization. I'm well aware of the MS debug CRT magic numbers - at the very least I'd like a trigger to turn 0xCDCDCDCD (the pattern that fills freshly allocated memory) into zeros. I suspect one would be able to easily smoke out nasty initialization pests this way, even in debug builds.
Am I missing an available CRT hook (API, registry key, whatever) that enables this? Does anyone have other ideas for getting there?
[Edit:] Seems clarifications are in order.
The usual magic numbers indeed have many advantages, but they do not provide coverage for bool initializations (any nonzero fill reads as true), or for bit fields that are tested against individual bit masks, or similar cases. Consistent zero initialization (that I'd be able to toggle on and off, of course) would add a layer of testing that can surface bad init behaviour that is otherwise rare.
I'm aware, of course, of _CrtSetAllocHook. A hook set that way does not receive a pointer to the allocated buffer (it is called before the buffer is allocated), so it cannot overwrite it. Overloading global new wouldn't do much good either, as it would overwrite any valid constructor initialization.
[Edit:] @Michael, not sure what you mean by overriding new. Simple code like -
void* operator new(size_t size)
{
    void* res = ::operator new(size); // constructors now called!
    if(SomeExternalConditionApplies())
        OverWriteBufferWithMyPetValues(res, size);
    return res;
}
would not work. Pasting and modifying the entire ::new code might work, but seems kinda scary (god only knows what I'd have to #include and link to get it to run).
I'm not following - having uninitialized memory set to something like 0xCDCDCDCD instead of 0 is better for flushing out bugs, because code is more likely to get 'in range' arithmetic from 0, or to specially handle 0. With the wildly invalid value, bugs are more likely to 'fail fast' so they can be fixed instead of hidden.
The values that the debug build of MSVC uses are specifically designed to help cause failures that can be easily detected:
they aren't 0, so NULL pointer checks against uninitialized memory won't hide the bug
they aren't valid pointers, so dereferencing uninitialized pointers will cause an access violation
they aren't 'usual' integer values, so calculations involving uninitialized data will usually produce wildly incorrect results that tend to cause noticeable failures (I think being negative when handled as signed data also helps a bit with this, but not as much as simply being unusual numbers).
Also, they're easy to recognize in data displays in the debugger. Zero doesn't stand out nearly as much.
All that said, MSVC provides a number of debug hooks/APIs that you might be able to use to do something along the lines of what you want:
http://msdn.microsoft.com/en-us/library/1666sb98.aspx
Some additional information in response to your updated question:
You might be able to use a 3rd party debug allocation library like Dmalloc (http://dmalloc.com/) but I honestly don't know how easy those libraries are to integrate into an MSVC project, especially 'real world' ones.
Also, note that these obviously will only deal with dynamic allocations (and might not integrate well with MSVC's default new implementation).
You can use a global override of operator new() to deal with allocations made with new in C++. There's no problem of overwriting valid constructor initialization: the allocation occurs before the constructor performs any initialization (if you think about it for a moment, it should be clear why that is).
Also, you might want to consider moving to Visual Studio 2010 - it will break into the debugger when you use an uninitialized local variable, with nothing special required other than running the Debug build under the debugger. Of course, MSVC has warned about many of these situations for a while, but VS2010 will catch the following in the debugger, even though it produces no warnings (at least with my current compiler settings):
#include <stdio.h>

int main( )
{
    unsigned int x;
    volatile bool a = false;
    if (a) {
        x = 0;
    }
    printf( "Hello world %u\n", x); // VS2010 will break here because x is uninitialized
    return 0;
}
Even the Express version of VC++ 2010 supports this.
Just a suggestion: can you use the static code analysis tool in your compiler? /analyze will give you a C6001 warning that you're using uninitialized memory. And it's somewhat systematic, which is what you're asking for.
Stepping into the CRT shows the magic numbers are used in _heap_alloc_dbg and realloc_help, and the value itself is defined as
static unsigned char _bCleanLandFill = 0xCD; /* fill new objects with this */
Knowing what to search for often helps. The linked thread does have a nice suggestion: set a watch on _bCleanLandFill and modify it from the debugger.
It does work, but I'll keep this question open for a while - I still hope someone has a better idea... I was hoping to run automated tests with controlled initializations, and not have to do it manually (and only with a debugger available).
I'm using Peter Below's PBThreadedSplashForm to display a splash window during application startup.
This component worked great for 10 years, but, since updating my Delphi to 11.2, I get an AV on the CreateWindowEx call.
This happens on the Win64 platform only; there are no problems on Win32.
Does anyone know what could be the cause of this?
This is one of the many issues that have surfaced in 11.2 due to the new default ASLR settings in the compiler and linker.
After a very quick glance at the source code I see this:
SetWindowLong( wnd, GWL_WNDPROC, Integer( thread.FCallstub ));
thread.FCallstub is defined as Pointer.
Just as I thought.
You see, pointers are of native size, so in 32-bit applications, pointers are 32 bits wide, while in 64-bit applications, pointers are 64 bits wide.
It was very common in the 32-bit world that pointer values were temporarily saved in Integers. This worked because a 32-bit pointer fits in a 32-bit Integer.
But in a 64-bit application, this is an obvious bug, since a 64-bit pointer doesn't fit in a 32-bit Integer. It's like taking a phone number like 5362417812 and truncating it to 17812, hoping that it will still "work".
Of course, in general, this causes bugs such as AVs and memory corruption.
However, until recently, there was a rather high probability that a pointer in a 64-bit Delphi application by "chance" didn't use its upper 32 bits (so it was something like $0000000000A3BE41, and truncating it to $00A3BE41 didn't have any effect). So it seemed to work most of the time, but only by accident.
Now, recent versions of the Delphi compiler and linker enable ASLR by default, making such accidents much less likely.
And this is a good thing: if you have a serious bug in your code, it is better to discover it right away rather than "randomly" at your customers' sites.
So, to fix the issue, you need to go through the code and make sure you never store a pointer in a 32-bit Integer. Instead, use a native-sized NativeInt, Pointer, LPARAM, or whatever is semantically appropriate.
(Disabling ASLR will also make it work in "many" cases by accident again, but this is a very bad approach. Your software still has a very serious bug that may manifest itself at any time.)
In your code, there are also casts like
Integer( PChar( FStatusMessage ))
Integer( PChar( msg ))
which have the same problem.
The recent leak from Wikileaks has the CIA doing the following:
DO explicitly remove sensitive data (encryption keys, raw collection
data, shellcode, uploaded modules, etc) from memory as soon as the
data is no longer needed in plain-text form.
DO NOT RELY ON THE OPERATING SYSTEM TO DO THIS UPON TERMINATION OF
EXECUTION.
Me being a developer in the *nix world, I'm seeing this as merely changing the value of a variable (ensuring I do not pass by value, and instead by reference); so if it's a string that's 100 characters, that means writing 101 zero characters. Is it really this simple? If not, why, and what should be done instead?
Note: There are similar questions that asked this, but in the C# and Windows world, so I do not consider this question a duplicate.
Me being a developer in the *nix world, I'm seeing this as merely changing the value of a variable (ensuring I do not pass by value, and instead by reference); so if it's a string that's 100 characters, that means writing 101 zero characters. Is it really this simple? If not, why, and what should be done instead?
It should be this simple. The devil is in the details.
memory allocation functions, such as realloc, are not guaranteed to leave memory alone (you should not rely on their doing it one way or the other - see also this question). If you allocate 1K of memory, then realloc it to 10K, your original 1K might still be there somewhere else, containing its sensitive payload. It might then be allocated to another, non-sensitive variable or buffer (or not), and through that new variable it might be possible to access part or all of the old content, much as happened with slack space on some filesystems.
manually zeroing memory (and, with most compilers, bzero and memset count as manual loops) might be blithely optimized out, especially if you're zeroing a local variable ("bug" - actually a feature, with workaround).
some functions might leave "traces" in local buffers or in memory they allocate and deallocate.
in some languages and frameworks, whole portions of data could end up being moved around (e.g. during so-called "garbage collection", as noticed by @gene). You may be able to tell the GC not to process your sensitive area or otherwise "pin" it to that effect, and if so, you must do so. Otherwise, data might end up in multiple, partial copies.
information might have come through and left traces you're not aware of (trivial example: a password sent through the network might linger in the network library read buffer).
live memory might be swapped out to disk.
Example of realloc doing its thing. Memory gets partly rewritten, and with some libraries this will only "work" if "a" is not the only allocated area (so you need to also declare c and allocate something immediately after a, so that a is not the last object and left free to grow):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    char *a;
    char *b;
    a = malloc(1024);
    strcpy(a, "Hello");
    strcpy(a + 200, "world");
    printf("a at %08ld is %s...%s\n", (long)a, a, a + 200);
    b = realloc(a, 10240);
    strcpy(b, "Hey!");
    /* deliberately reads through the stale pointer a - that is the point */
    printf("a at %08ld is %s...%s, b at %08ld is %s\n", (long)a, a, a + 200, (long)b, b);
    return 0;
}
Output:
a at 19828752 is Hello...world
a at 19828752 is 8????...world, b at 19830832 is Hey!
So the memory at address a was partly rewritten - "Hello" is lost, "world" is still there (as well as at b + 200).
So you need to handle reallocations of sensitive areas yourself; better yet, pre-allocate it all at program startup. Then tell the OS that the sensitive area of memory must never be swapped to disk. Then you need to zero it in such a way that the compiler can't interfere. And you need to use a language low-level enough that you're sure it doesn't do things behind your back: a simple string concatenation could spawn two or three copies of the data - I'm fairly certain that happened in PHP 5.2.
Ages ago I wrote myself a small library - there was no Valgrind yet - inspired by Steve Maguire's Writing Solid Code; apart from overriding the various memory and string functions, I ended up overwriting memory and then calculating the checksum of the overwritten buffer. This was not for security; I used it to track buffer over/underflows, double frees, use of freed memory - that kind of thing.
And then you need to ensure your failsafes work - for example, what happens if the program aborts? Might it be possible to make it abort on purpose?
You need to implement defense in depth, and always look at ways to keep as little information around as possible - for example clearing the intermediate buffers during a calculation rather than waiting and freeing the whole lot in one fell swoop at the very end, or just when exiting the program; keeping hashes instead of passwords when at all possible; and so on.
Of course all this depends on how sensitive the information is and what the attack surface is likely to be (mandatory xkcd reference: here). Rebooting the PC with a memtest86 image could be a viable alternative. Think of a dual-boot computer with memtest86 set to test memory and power down the PC as default boot option. When you want to turn off the system... you reboot it instead. The PC will reboot, enter memtest86 by default, and before powering off for good, it'll start filling all available RAM with marching troops of zeros and ones. Good luck freeze-booting information from that.
Zeroing out secrets (passwords, keys, etc) immediately after you are done with them is fairly standard practice. The difficulty is in dealing with language and platform features that can get in your way.
For example, C++ compilers can optimize out calls to memset if it determines that the data is not read after the write. Or operating systems may have paged the memory out to disk, potentially leaving the data available that way.
I have a quite complex I/O program (written by someone else) for the controller ICPDAS i-7188ex, and I am writing a library (.lib) for it that does some calculations based on data from that program.
Problem is, if I import a function containing only the single line printf("123") and call it from the I/O program, the program crashes at some point. Without the imported function, the I/O program works fine; likewise, the imported function works fine without the I/O program.
Maybe it is a memory issue, but why should considerable memory be allocated for a function which only outputs a string? Or am I completely wrong?
I am using Borland C++ 3.1. And yes, I can't use anything newer, since the controller supports only the 80186 instruction set.
If your code is complex, the compiler can sometimes get stuck and compile it wrongly, messing things up with unpredictable behavior. This has happened to me many times as code grows. In such cases, swapping a few lines of code (if you can without breaking functionality) or even adding a few empty or rem lines inside the code sometimes helps. The problem is finding the place where it happens. You can also divide your program into several files, compile each separately to .obj, and then just link them into the final file...
The error description reminds me of one I fought with for a long time. If you are using class/struct/template, try this:
bds 2006 C hidden memory manager conflicts
Maybe it will help (I did not test this for old Turbo C++).
What do you mean by embedding it into the I/O program? Are you creating a SYS driver file? If that is the case, you need to make sure you are not messing with CPU registers; that could cause a lot of problems. Try something like
void some_function_or_whatever()
{
    asm { pusha };  // save all general-purpose registers
    // your code here
    printf("123");
    asm { popa };   // restore them before returning
}
If you are writing ISR handlers, then you need to use the interrupt keyword so the compiler returns from them properly.
Without actual code and/or an MCVE it is hard to point out any specifics...
If you can port this to BDS2006 or a newer version (just for debugging, not for the real target), it will analyze your code more carefully and can detect a lot of hidden errors (I was surprised when I ported from the BCB series to BDS2006). There is also the CodeGuard option in the compiler, which is ideal for finding such errors at runtime (but I fear you will not be able to run your lib without the I/O hardware present in emulated DOS).
I'm hunting for some memory leaks in a long-running service (using F#) right now.
The only "strange" thing I've seen so far is the following:
I use a MailboxProcessor in a subsystem with an algebraic datatype named QueueChannelCommands (more or less a bunch of Add/Get commands - some with AsyncReplyChannels attached).
When I profile the service (using Ants Memory Profiler), I see instances of arrays of the mentioned type (most having length 4, but growing) - all empty (null) - whose references seem to be held by Control.Mailbox:
I cannot see any reason in my code for this behaviour (it's the standard code you can find in every Mailbox example out there - just a loop with a let! = receive and a match to follow, ended with a return! loop()).
Has anyone seen this kind of behaviour before or even knows how to handle this?
Or is this even a (known) bug?
Update: the growing of the arrays is really strange - it seems like additional space is appended without being used properly:
I am not an F# expert by any means, but maybe you can look at the first answer in this thread:
Does Async.StartChild have a memory leak?
The first reply mentions a tutorial for memory profiling on the following page:
http://moiraesoftware.com/blog/2011/12/11/fixing-a-hole/
But they mention this open source version of F#
https://github.com/fsharp/fsharp/blob/master/src/fsharp/FSharp.Core/control.fs
And I am not sure it is what you are looking for (regarding the open source version of F# in the last link), but maybe it can help you find the source of the leak, or prove that it is actually leaking memory.
Hope that helps.
Tony
.NET has its own garbage collector, which works quite nicely.
The most common way to cause memory leaks in .NET is by hooking up delegates (e.g. event handlers) and never removing them, so the objects they reference are kept reachable.
This is part of a series of at least two closely related, but distinct questions. I hope I'm doing the right thing by asking them separately.
I'm trying to get my Visual C++ 2008 app to work without the C Runtime Library. It's a Win32 GUI app without MFC or other fancy stuff, just plain Windows API.
So I set Project Properties -> Configuration -> C/C++ -> Advanced -> Omit Default Library Names to Yes (compiler flag /Zl) and rebuilt.
Then the linker complains about an unresolved external _WinMainCRTStartup. Fair enough, I can tell the linker to use a different entry point, say MyStartup. From what I gather around the web, _WinMainCRTStartup does some initialization stuff, and I probably want MyStartup to do a subset of that.
So my question is: What functions does _WinMainCRTStartup perform, and which of these can I omit if I don't use the CRT?
If you are knowledgeable about this stuff, please have a look at my other question too. Thanks!
Aside: Why do I want to do this in the first place?
My app doesn't explicitly use any CRT functions.
I like lean and mean apps.
It'll teach me something new.
The CRT's entry point does the following (this list is not complete):
Initializes global state needed by the CRT. If this is not done, you cannot use any functions or state provided by the CRT.
Initializes some global state that is used by the compiler. Run-time checks such as the security cookie used by /GS definitely stand out here. You can call __security_init_cookie yourself, however. You may need to add other code for other run-time checks.
Calls constructors on C++ objects. If you are writing C++ code, you may need to emulate this.
Retrieves the command line and startup information provided by the OS and passes them to your main. By default, no parameters are passed to the entry point of the program by the OS - they are all provided by the CRT.
The CRT source code is available with Visual Studio and you can step through the CRT's entry point in a debugger and find out exactly what it is doing.
A true Win32 program written in C (not C++) doesn't need any initialization at all, so you can start your project with WinMainCRTStartup() instead of WinMain(HINSTANCE,...).
It's also possible but a bit harder to write console programs as true Win32 applications; the default name of entry point is _mainCRTStartup().
Disable all extra code generation features like stack probes, array checks etc. Debugging is still possible.
Initialization
Sometimes you need the first HINSTANCE parameter. For Win32 (except Win32s), it is simply the image base, by default (HINSTANCE)0x400000.
The nCmdShow parameter is always SW_SHOWDEFAULT.
If necessary, retrieve the command line with GetCommandLine().
Termination
When your program spawns threads, e.g. by calling GetOpenFileName(), returning from WinMainCRTStartup() with return keyword will hang your program — use ExitProcess() instead.
Caveats
You will run into considerable trouble when:
using stack frames (i.e. local variables) larger than 4 KBytes (per function)
using floating-point arithmetic (e.g. float->int conversion)
using 64-bit integers on 32-bit machines (multiply, bit-shift operations)
using C++ new, delete, and static objects with non-zero-out-all-members constructors
using standard library functions like fopen(), printf() of course
Troubleshoot
There is a C standard library available on all Windows systems (since Windows 95), the MSVCRT.DLL.
To use it, import its entry points, e.g. using my msvcrt-light.lib (google for it). But there are still some caveats, especially when using compilers newer than MSVC6:
stack frames are still limited to 4 KBytes
_ftol_sse or _ftol2_sse must be routed to _ftol
_iob_func must be routed to _iob
Its initialization seems to run at load time. At least the file functions will run seamlessly.
Old question, but the answers are either incorrect or focus on one specific problem.
There are a number of C and C++ features that simply will not be available on Windows (or most operating systems, for that matter) if the programs actually started at main/WinMain.
Take this simple example:
class my_class
{
public:
    my_class() { m_val = 5; }
    int my_func() { return m_val; }
private:
    int m_val;
};

my_class g_class;

int main(int argc, char **argv)
{
    return g_class.my_func();
}
In order for this program to function as expected, the constructor for my_class must be called before main. If the program started exactly at main, it would require a compiler hack (note: GCC does this in some cases) to insert a function call at the very beginning of main. Instead, on most OSes and in most cases, a different function constructs g_class and then calls main (on Windows, this is either mainCRTStartup or WinMainCRTStartup; on most other OSes I'm used to, it is a function called _start).
There are other things C++ and even C require to be done before or after main in order to work.
How are stdin and stdout (std::cin and std::cout) useable as soon as main starts?
How does atexit work?
The C standard requires the standard library have a POSIX-like signal API, which on Windows must be "installed" before main().
On most OSes, there is no system-provided heap; the C runtime implements its own heap (Microsoft's C runtime just wraps the Kernel32 Heap functions).
Even the arguments passed to main, argc and argv, must be obtained from the system somehow.
You might want to take a look at Matt Pietrek's (ancient) articles on implementing his own C runtime for specifics on how this works with Windows + MSVC (note: MinGW and Cygwin implement specific things differently, but actually fall back to MSVCRT for most things):
http://msdn.microsoft.com/en-us/library/bb985746.aspx