debugging a strange error when new is called - visual-c++

I have a program that when run compiled with Microsoft Visual C++ 2008 Express crashes on the line
comparison_vectors = new vec_element[(rbfnetparams->comparison_vector_length)+1];
with the error Unhandled exception at 0x7c93426d in myprog.exe: 0xC0000005: Access violation reading location 0x00000000
rbfnetparams->comparison_vector_length evaluates to 4 (should do and checked in the debugger), and the thing still crashes here when I change the line as a test to:
comparison_vectors = new vec_element[5];
vec_element is a structure with several ints, doubles and a few bools, but no methods or constructor. The thing runs if I replace new with malloc, but then crashes on another new elsewhere. It does not crash every time this line is run, only sometimes, but seems to do so after the same number of iterations of this line each time. Memory usage is only 10MB at this point in the program.
This gets stranger as the same program DOES compile and run under gcc on Solaris, which usually shows up far more errors than Windows does.
Any help would be appreciated, as I am at a loss as to how to debug this one.

Access violation reading location 0x00000000 means "you dereferenced a NULL pointer." It looks like once in a while rbfnetparams is NULL when you reach this line, and thus you get the error.
I can't explain why comparison_vectors = new vec_element[5]; crashes. Is it the same error message?
Check if rbfnetparams is NULL before the line, and see if it gets hit (or add a conditional break point). Then decide if the fact that rbfnetparams is NULL is a symptom of a bigger bug somewhere else.
Dereferencing a NULL pointer is undefined. It's possible that the Solaris compiler does an optimization that masks the bug. That's allowed by the Standard (read the whole series referenced from that post).

Related

"shmop_open(): unable to attach or create shared memory segment 'No error':"?

I get this every time I try to create an account to ask this on Stack Overflow:
Oops! Something Bad Happened!
We apologize for any inconvenience, but an unexpected error occurred while you were browsing our site.
It’s not you, it’s us. This is our fault.
That's the reason I post it here. I literally cannot ask it on Overflow, even after spending hours of my day (on and off) repeating my attempts and solving a million reCAPTCHA puzzles. Can you maybe fix this error soon?
With no meaningful/complete examples, and basically no documentation whatsoever, I've been trying to use the "shmop" part of PHP for many years. Now I must find a way to send data between two different CLI PHP scripts running on the same machine, without abusing the database for this. It must work without database support, which means I'm trying to use shmop, but it doesn't work at all:
$shmopid = shmop_open(1, 'w', 0644, 99999); // I have no idea what the "key" is supposed to be. It says: "System's id for the shared memory block. Can be passed as a decimal or hex.", so I've given it a 1 and also tried with 123. It gave an error when I set the size to 64, so I increased it to 99999. That's when the error changed to the one I now face above.
shmop_write($shmopid, 'meow 123', 0); // Write "meow 123" to the shared variable.
while (1)
{
$shared_string = shmop_read($shmopid, 0, 8); // Read the "meow 123", even though it's the same script right now (since this is an example and minimal test).
var_dump($shared_string);
sleep(1);
}
I get the error for the first line:
shmop_open(): unable to attach or create shared memory segment 'No error':
What does that mean? What am I doing wrong? Why is the manual so insanely cryptic for this? Why isn't this just a built-in "superarray" that can be accessed across the scripts?
About CLI:
It cannot work in standalone CLI processes, as an answer here says:
https://stackoverflow.com/a/34533749
The master process is the one to hold the shared memory block, so you will have to use php-fpm or mod_php or some other web/service-running version, and maybe even start/request/stop it all from a CLI php script.
About shmop usage itself:
Use "c" mode in shmop_open() for creating the segment before it can be used with "a" or "w".
I stumbled on this error in a different scenario where shared memory is completely optional to speed up some repeated operations. So I wanted to try reading first without knowing memory size and then allocate from actual data when needed. In my case I had to call it as #shmop_open() to hide this error output.
About shmop on Windows:
PHP 7 crashed Apache worker process causing its restart with status 3221225477 when trying to reallocate a segment with the same predefined (arbitrary number) key but different size, even after shmop_delete(). As a workaround for this, I took filemtime() of the source file which contains data to be stored in memory, and used that timestamp as the key in shmop_open(). It still was not flawless IIRC, and I don't know if it would cause memory leaks, but it was enough to test my code which would mainly run on Linux anyway.
Finally, as of PHP version 8.0.7, shmop seems to work fine with Apache 2.4.46 and mod_php in Windows 10.

Linux filp_open error number definitions

I have a question about the filp_open function:
I can get the error number from the IS_ERR function but I do not understand the meaning of the error number.
Where can find the filp_open error number definitions?
You should not use filp_open to read/write files in kernel mode. For (obvious) security reasons. Other reasons can be found in this answer and this answer (taken from this comment). The official documentation also recomends not to use flp_open:
This is the helper to open a file from kernelspace if you really have to. But in generally you should not do this, so please move along, nothing to see here..
Error code definitions
The kernel uses the same error numbers (errno) in kernel space as in the user space. So, as OmnipotentEntity pointed out, you can see man errno for a reference on what the errors generally mean.
It is also helpful to have a look at the actual implementation of filp_open and its possible error sources, such as file_open_name and build_open_flags.
Note that IS_ERR does not return the error but merely returns whether the supplied pointer is an error value or not. You have to use PTR_ERR to retrieve the error value from the pointer in case IS_ERR is true. Example:
fptr = filp_open(...)
if (IS_ERR(fptr)) {
printk("%d\n", PTR_ERR(fptr));
}

Copy Constructor for MyString causes HEAP error. Only Gives Error in Debug Mode

So, I've never experienced this before. Normally when I get an error, it always triggers a breakpoint. However, this time when I build the solution and run it without debugging (ctrl+F5), it gives me no error and runs correctly. But when I try debugging it (F5), it gives me this error:
HEAP[MyString.exe]: HEAP: Free Heap block 294bd8 modified at 294c00 after it was freed
Windows has triggered a breakpoint in MyString.exe.
This may be due to a corruption of the heap, which indicates a bug in MyString.exe or any of the DLLs it has loaded.
This may also be due to the user pressing F12 while MyString.exe has focus.
The output window may have more diagnostic information.
This assignment is due tonight, so I'd appreciate any quick help.
My code is here:
https://gist.github.com/anonymous/8d84b21be6d1f4bc18bf
I've narrowed the problem down in the main to line 18 in main.cpp ( c = a + b; ) The concatenation succeeds, but then when it is to be copied into c, the error message occurs at line 56 in MyString.cpp ( pData = new char[length + 1]; ).
The kicker is I haven't had a problem with this line of code until I tried overloading the operator>>. I've since scrapped that code for the sake of trying to debug this.
Again, any help would be appreciated!
Let's go through line 18:
1. In line 17 you create string c with dynamically allocated memory inside.
2. You make assignment: c = a + b:
2.1. Operator+ creates LOCAL object 'cat'.
2.2. cat's memory is allocated dynamically.
2.3. cat becomes concatenation of two given strings.
2.4. Operator+ exits. cat is LOCAL object and it's being destroyed.
2.4.1. cat is being destroyed. cat's destructor runs.
2.4.2. destructor deletes pData;
2.4.3. After delete you make *pData = NULL. //ERROR - should be pData = NULL (1)
2.5. c is initialized with result of operator+.
2.6. operator= calls copy().
2.7. copy() allocates new memory without checking the current one. //ERROR - memory leak (2)
(1)
pData is char*. In destructor we have: delete[] pData (deletes memory) and then *pData = NULL.
Because pData is a pointer, *pData is same as pData[0]. So you write to already freed memory. This is the cause of your error.
(2)
Additional problem. Copy() overwrites current memory without checking. Should be:
copy()
{
if(this->pData)
{
delete this->pData;
}
//now allocate new buffer and copy
}
Also, when dealing with raw bytes (chars), you don't want to use new() and delete(), but malloc() and free() instead. In this case, in functions like copy(), instead of calling delete() and then new(), you would simply use realloc().
EDIT:
One more thing: errors caused by heap damage usually occur during debugging. In release binary, this will simply overwrite some freed (and maybe already used by someone else) memory. That's why debugging is so important when playing with memory in C++.

gdb break when program opens specific file

Back story: While running a program under strace I notice that '/dev/urandom' is being open'ed. I would like to know where this call is coming from (it is not part of the program itself, it is part of the system).
So, using gdb, I am trying to break (using catch syscall open) program execution when the open call is issued, so I can see a backtrace. The problem is that open is being called alot, like several hundred times so I can't narrow down the specific call that is opening /dev/urandom. How should I go about narrowing down the specific call? Is there a way to filter by arguments, and if so how do I do it for a syscall?
Any advice would be helpful -- maybe I am going about this all wrong.
GDB is a pretty powerful tool, but has a bit of a learning curve.
Basically, you want to set up a conditional breakpoint.
First use the -i flag to strace or objdump -d to find the address of the open function or more realistically something in the chain of getting there, such as in the plt.
set a breakpoint at that address (if you have debug symbols, you can use those instead, omitting the *, but I'm assuming you don't - though you may well have them for library functions if nothing else.
break * 0x080482c8
Next you need to make it conditional
(Ideally you could compare a string argument to a desired string. I wasn't getting this to work within the first few minutes of trying)
Let's hope we can assume the string is a constant somewhere in the program or one of the libraries it loads. You could look in /proc/pid/maps to get an idea of what is loaded and where, then use grep to verify the string is actually in a file, objdump -s to find it's address, and gdb to verify that you've actually found it in memory by combining the high part of the address from maps with the low part from the file. (EDIT: it's probably easier to use ldd on the executable than look in /proc/pid/maps)
Next you will need to know something about the abi of the platform you are working on, specifically how arguments are passed. I've been working on arm's lately, and that's very nice as the first few arguments just go in registers r0, r1, r2... etc. x86 is a bit less convenient - it seems they go on the stack, ie, *($esp+4), *($esp+8), *($esp+12).
So let's assume we are on an x86, and we want to check that the first argument in esp+4 equals the address we found for the constant we are trying to catch it passing. Only, esp+4 is a pointer to a char pointer. So we need to dereference it for comparison.
cond 1 *(char **)($esp+4)==0x8048514
Then you can type run and hope for the best
If you catch your breakpoint condition, and looking around with info registers and the x command to examine memory seems right, then you can use the return command to percolate back up the call stack until you find something you recognize.
(Adapted from a question edit)
Following Chris's answer, here is the process that eventually got me what I was looking for:
(I am trying to find what functions are calling the open syscall on "/dev/urandom")
use ldd on executable to find loaded libraries
grep through each lib (shell command) looking for 'urandom'
open library file in hex editor and find address of string
find out how parameters are passed in syscalls (for open, file is first parameter. on x86_64 it is passed in rdi -- your mileage may vary
now we can set the conditional breakpoint: break open if $rdi == _addr_
run program and wait for break to hit
run bt to see backtrace
After all this I find that glib's g_random_int() and g_rand_new() use urandom. Gtk+ and ORBit were calling these functions -- if anybody was curious.
Like Andre Puel said:
break open if strcmp($rdi,"/dev/urandom") == 0
Might do the job.

Visual C++ App crashes before main in Release, but runs fine in Debug

When in release it crashes with an unhandled exception: std::length error.
The call stack looks like this:
msvcr90.dll!__set_flsgetvalue() Line 256 + 0xc bytes C
msvcr90.dll!__set_flsgetvalue() Line 256 + 0xc bytes C
msvcr90.dll!_getptd_noexit() Line 616 + 0x7 bytes C
msvcr90.dll!_getptd() Line 641 + 0x5 bytes C
msvcr90.dll!rand() Line 68 C
NEM.exe!CGAL::Random::Random() + 0x34 bytes C++
msvcr90.dll!_initterm(void (void)* * pfbegin=0x00000003, void (void)* * pfend=0x00345560) Line 903 C
NEM.exe!__tmainCRTStartup() Line 582 + 0x17 bytes C
kernel32.dll!7c817067()
Has anyone got any clues?
Examining the stack dump:
InitTerm is simply a function that walks a list of other functions and executes each in step - this is used for, amongst other things, global constructors (on startup), global destructors (on shutdown) and atexit lists (also on shutdown).
You are linking with CGAL, since that CGAL::Random::Random in your stack dump is due to the fact that CGAL defines a global variable called default_random of the CGAL::Random::Random type. That's why your error is happening before main, the default_random is being constructed.
From the CGAL source, all it does it call the standard C srand(time(NULL)) followed by the local get_int which, in turn, calls the standard C rand() to get a random number.
However, you're not getting to the second stage since your stack dump is still within srand().
It looks like it's converting your thread into a fiber lazily, i.e., this is the first time you've tried to do something in the thread and it has to set up fiber-local storage before continuing.
So, a couple of things to try and investigate.
1/ Are you running this code on pre-XP? I believe fiber-local storage (__set_flsgetvalue) was introduced in XP. This is a long shot but we need to clear it up anyway.
2/ Do you need to link with CGAL? I'm assuming your application needs something in the CGAL libraries, otherwise don't link with it. It may be a hangover from another project file.
3/ If you do use CGAL, make sure you're using the latest version. As of 3.3, it supports a dynamic linking which should prevent the possibility of mixing different library versions (both static/dynamic and debug/nondebug).
4/ Can you try to compile with VC8? The CGAL supported platforms do NOT yet include VC9 (VS2008). You may need to follow this up with the CGAL team itself to see if they're working on that support.
5/ And, finally, do you have Boost installed? That's another long shot but worth a look anyway.
If none of those suggestions help, you'll have to wait for someone more knowledgable than I to come along, I'm afraid.
Best of luck.
Crashes before main() are usually caused by a bad constructor in a global or static variable.
Looks like the constructor for class Random.
Do you have a global or static variable of type Random? Is it possible that you're trying to construct it before the library it's in has been properly initialized?
Note that the order of construction of global and static variables is not fixed and might change going from debug to release.
Could you be more specific about the error you're receiving? (unhandled exception std::length sounds weird - i've never heard of it)
To my knowledge, FlsGetValue automatically falls back to TLS counterpart if FLS API is not available.
If you're still stuck, take .dmp of your process at the time of crash and post it (use any of the numerous free upload services - and give us a link) (Sounds like a missing feature in SO - source/data file exchange?)

Resources