I am hacking a plugin that requires the host-application to allow an executable stack.
This can be achieved by running
execstack -s /path/to/my/host
However, if the host application lacks the executable stack flag (e.g. the above command has not been called), running my plugin simply crashes the host:
Program received signal SIGSEGV, Segmentation fault.
I would like to avoid the crash, e.g. by disabling parts of my code automatically if the executable stack lag is not set.
The check should happen at runtime during the plugin initialization
However, I haven't found any documentation on how to detect the availability of the executable stack at runtime (without crashing).
The only thing I have found so far is execstack -q /path/to/my/host, but that seems hacky to run from within a plugin loaded by /path/to/my/host.
so it seems there is better solution to my problem than querying the protection scheme in runtime: explicitely flag a memory to be executable, using
int mprotect(void *addr, size_t len, int prot);
this basically adds an exception to the executable stack protection in a well defined memory region.
Related
I have a very complex cross-platform application. Recently my team and I have been running stress tests and have encountered several crashes (and core dumps accompanying them). Some of these core dumps are very precise, and show me the exact location where the crash occurred with around 10 or more stack frames. Others sometimes have just one stack frame with ?? being the only symbol!
What I'd like to know is:
Is there a way to increase the probability of core dumps pointing in the right direction?
Why isn't the number of stack frames reported consistent?
Any best practice advise for managing core dumps.
Here's how I compile the binaries (in release mode):
Compiler and platform: g++ with glibc-2.3.2-95.50 on CentOS 3.6 x86_64 -- This helps me maintain compatibility with older versions of Linux.
All files are compiled with the -g flag.
Debug symbols are stripped from the final binary and saved in a separate file.
When I have a core dump, I use GDB with the executable which created the core, and the symbols file. GDB never complains that there's a mismatch between the core/binary/symbols.
Yet I sometimes get core dumps with no symbols at all! It's understandable that I'm linking against non-debug version of libstdc++ and libgcc, but it would be nice if at least the stack trace shows me where in my code did the faulty instruction call originate (although it may ultimately end in ??).
Others sometimes have just one stack frame with "??" being the only symbol!
There can be many reasons for that, among others:
the stack frame was trashed (overwritten)
EBP/RBP (on x86/x64) is currently not holding any meaningful value — this can happen e.g. in units compiled with -fomit-frame-pointer or asm units that do so
Note that the second point may occur simply by, for example, glibc being compiled in such a way. Having the debug info for such system libraries installed could mitigate this (something like what the glibc-debug{info,source} packages are on openSUSE).
gdb has more control over the program than glibc, so glibc's backtrace call would naturally be unable to print a backtrace if gdb cannot do so either.
But shipping the source would be much easier :-)
As an alternative, on a glibc system, you could use the backtrace function call (or backtrace_symbols or backtrace_symbols_fd) and filter out the results yourself, so only the symbols belonging to your own code are displayed. It's a bit more work, but then, you can really tailor it to your needs.
Have you tried installing debugging symbols of the various libraries that you are using? For example, my distribution (Ubuntu) provides libc6-dbg, libstdc++6-4.5-dbg, libgcc1-dbg etc.
If you're building with optimisation enabled (eg. -O2), then the compiler can blur the boundary between stack frames, for example by inlining. I'm not sure that this would cause backtraces with just one stack frame, but in general the rule is to expect great debugging difficulty since the code you are looking it in the core dump has been modified and so does not necessarily correspond to your source.
Recently We have a production issue with application freezed, we tried to break in and analyse the dump file, unfortunately the call stack for the dump file does not looks good and hard to track down the cause of the freeze.
Two reasons why a call stack might look incorrect:
The stack might be corrupted. If the stack was corrupted for some reason (for instance, due to an overflow of a buffer which was allocated on the stack), all the stack frames are destroyed. This makes it impossible to compute the list of callers.
The symbols you use (if any) might not be appropriate for the binary which crashed. You need to use the exact same symbols which were used when compiling the binary. A slight change to the source code can render all the symbols invalid.
If the application hung rather than crashed, try loading into windbg and run !analyze -v -hang or try using adplus in hang mode. This tries to determine the cause of the hang which should give you a more meaningful call stack. The !locks command can be useful too if you have deadlock by showing you what is blocking on a resource.
If you call into the Windows API's, which then call back into you on the same thread (via Windows message handlers, for instance), it is not uncommon for the operations in the Windows DLL's to use stack conventions that the debugger cannot interpret. There's no requirement that the stack be traceable all the time during the execution of a c/c++ function/method, the stack-related registers can be re-used for other purposes and the standard places of stashing stack information can be ignored. I see this a lot in Windows.
I am writing an editor in assembly 64bit mode in linux. It runs correctly when I debug the program in GDB but it does not run correctly when I run it normally it means it has runtime errors when I use ./programName .
You're probably accessing uninitialized data or have some kind of memory corruption problem. This would explain the program behaving differently when run in the debugger - you're seeing the results of undefined behavior.
Run your program through valgrind's memcheck tool and see what it outputs. Valgrind is a powerful tool that will identify many runtime errors on Linux, including a full stack trace to the error.
If GDB disabling ASLR is what makes it work, perhaps set disable-randomization off in GDB will let your reproduce a crash inside GDB so you can debug it. Force gdb to load shared library at randomized address.
Otherwise enable core dumps from your program, and use GDB on the core dump.
gdb ./prog core.1234.
On x86, you can insert a ud2 instruction in your asm source to intentionally cause a crash at whatever point you want in your code, if you want to get a coredump to examine registers/memory at some point before it crashes on its own. All architectures have an undefined instruction you can use, but I only know the mnemonic for x86's off the top of my head.
I want to get address of instruction that causes external program to SIGSEGV. I tried using ptrace for this, but I'm getting EIP from kernel space (probably default signal handler?). How GDB is able to get the correct EIP?
Is there a way to make GDB provide this information using some API?
edit:
I don't have sources of the program, only binary executable. I need automation, so I can't simply use "run", "info registers" in GDB. I want to implement "info registers" in my own mini-debugger :)
You can attach to a process using ptrace. I found an article at Linux Gazette.
It looks like you will want PTRACE_GETREGS for the registers. You will want to look at some example code like strace to see how it manages signal handling and such. It looks to me from reading the documentation that the traced child will stop at every signal and the tracing parent must wait() for the signal from the child then command it to continue using PTRACE_CONT.
Compile your program with -g, run gdb <your_app>, type run and the error will occur. After that use info registers and look in the rip register.
You can use objectdump -D <your_app> to get some more information about the code at that position.
You can enable core dumps with ulimit -c unlimited before running your external program.
Then you can examine the core dump file after a crash using gdb /path/to/program corefile
Because it is binary and not compiled with debugging options you will have to view details at the register and machine code level.
Try making a core dump, then analyse it with gdb. If you meant you wanted to make gdb run all your commands at one touch of a key by 'automate', gdb ca do that too. Type your commands into a file and look into the help user-defined section of manuals, gdb can handle canned commands.
In Linux environment, when getting "glibc detected *** free(): invalid pointer" errors, how do I identify which line of code is causing it?
Is there a way to force an abort? I recall there being an ENV var to control this?
How to set a breakpoint in gdb for the glibc error?
I believe if you setenv MALLOC_CHECK_ to 2, glibc will call abort() when it detects the "free(): invalid pointer" error. Note the trailing underscore in the name of the environment variable.
If MALLOC_CHECK_ is 1 glibc will print "free(): invalid pointer" (and similar printfs for other errors). If MALLOC_CHECK_ is 0, glibc will silently ignore such errors and simply return. If MALLOC_CHECK_ is 3 glibc will print the message and then call abort(). I.e. its a bitmask.
You can also call mallopt(M_CHECK_ACTION, arg) with an argument of 0-3, and get the same result as with MALLOC_CHECK_.
Since you're seeing the "free(): invalid pointer" message I think you must already be setting MALLOC_CHECK_ or calling mallopt(). By default glibc does not print those messages.
As for how to debug it, installing a handler for SIGABRT is probably the best way to proceed. You can set a breakpoint in your handler or deliberately trigger a core dump.
I recommend you get valgrind:
valgrind --tool=memcheck --leak-check=full ./a.out
In general, it looks like you might have to recompile glibc, ugh.
You don't say what environment you're running on, but if you can recompile your code for OS X, then its version of libc has a free() that listens to this environment variable:
MallocErrorAbort If set, causes abort(3) to be called if an
error was encountered in malloc(3) or
free(3) , such as a calling free(3) on a
pointer previously freed.
The man page for free() on OS X has more information.
If you're on Linux, then try Valgrind, it can find some impossible-to-hunt bugs.
How to set a breakpoint in gdb?
(gdb) b filename:linenumber
// e.g. b main.cpp:100
Is there a way to force an abort? I recall there being an ENV var to control this?
I was under the impression that it aborted by default. Make sure you have the debug version installed.
Or use libdmalloc5: "Drop in replacement for the system's malloc',realloc', calloc',free' and other memory management routines while providing powerful debugging facilities
configurable at runtime. These facilities include such things as memory-leak tracking, fence-post write detection, file/line number reporting, and general logging of statistics."
Add this to your link command
-L/usr/lib/debug/lib -ldmallocth
gdb should automatically return control when glibc triggers an abort.
Or you can set up a signal handler for SIGABRT to dump the stacktrace to a fd (file descriptor). Below, mp_logfile is a FILE*
void *array[512 / sizeof(void *)]; // 100 is just an arbitrary number of backtraces, increase if you want.
size_t size;
size = backtrace (array, 512 / sizeof(void *));
backtrace_symbols_fd (array, size, fileno(mp_logfile));