How to get the gdb call stack trace?

I have a core dump and a separate file that holds the debug information. Can I use gdb, without the executable file, to get a call stack with function names and line numbers?

At least on Linux/x86_64, the answer is no: the info saved after objcopy --only-keep-debug is not sufficient; you also need the executable file.
This is happening (at least in part) because the debug_file does not have the .eh_frame section, which is necessary for unwinding on x86_64.
If you are debugging the core dumps yourself, there is no reason to create debug_file -- just keep the original executable with full debug info for debugging (you can still ship a smaller stripped file to execution machines).

Related

How addr2line can locate the source file and the line of code?

addr2line translates addresses into file names and line numbers. I am still a beginner at debugging and have some questions about addr2line.
If I am debugging a certain .so (binary) file, how can the tool locate its source code file (where does it get it from?), and what if the source doesn't exist?
What is the relation between an address in a binary and the line number in its source that lets addr2line do this kind of mapping?
In general, addr2line works best on ELF executables or shared libraries with debug information. That debug information is emitted by the compiler when you pass -g (or -g2, etc.) to GCC. It notably records the mapping between machine-code addresses and source code locations (name of source file, line number, column number), along with functions, variable names, call stack frame organization, and so on. Debug information today is in DWARF format (and is also processed by the gdb debugger, the libbacktrace library, etc.). Notice that the debug information contains source file paths (not the source files themselves), so addr2line can still report a file name and line number when the source is not present.
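The address-to-line mapping can be pictured as a sorted table of rows, each marking the address where a new source line begins. The sketch below is a toy model of that lookup, not a DWARF parser; the table contents and the names LINE_TABLE and lookup are made up for illustration.

```python
from bisect import bisect_right

# Hypothetical line table: (start address, source file, line number),
# sorted by address. Each row applies until the next row's address.
LINE_TABLE = [
    (0x1000, "render.c", 10),
    (0x1008, "render.c", 11),
    (0x1010, "render.c", 14),
]


def lookup(table, address):
    """Return (file, line) of the last row at or below address, or None."""
    starts = [row[0] for row in table]
    index = bisect_right(starts, address) - 1
    if index < 0:
        return None  # address lies before the first row
    _, filename, line = table[index]
    return filename, line


print(lookup(LINE_TABLE, 0x100C))
```

A real DWARF line table also marks where each address sequence ends, which this toy model ignores.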
In practice, you can (and often should) pass the -g (or -g2) debugging option to GCC even with optimization flags like -O2. In that case, the debug information is slightly less precise but still practically usable. In some cases, stack frames may disappear (inlined function calls, tail call optimizations, etc.).
You can use the strip(1) utility to remove debug information (and other symbol tables, etc.) from an ELF executable.

game renderer thread backtrace with no symbols on linux

I have a game application running on Linux. We are a gaming company. I am having a random crash that occurs about once every 24-48 hours. The last time it occurred I tried to see the backtrace of the thread where it crashed; however, gdb showed that the stack was corrupted, with no symbols.
Now, when I run the game and interrupt it in gdb, I am sometimes able to see the function call stack for this thread, but most of the time I do not see any symbols. The thread is a renderer thread.
Some of the game libraries we are using are proprietary third-party libraries with no debugging symbols. So I was wondering: could it be that the renderer thread's call stack is deep inside these libraries (various calls within the library) without symbols, and so I do not get to see the call stack? If that is true, how can I fix this?
If not, any idea what could be the cause?
(gdb) bt
#0 0x9f488882 in ?? ()
Also, I ran info proc mappings, and for the address in the backtrace above I found the following:
0x9f488000 0x9f48a000 0x2000 0x0 /tmp/glyFI8DP (deleted)
This means that your third-party library is using just-in-time compilation: it generates some code, mmaps it into your process, and deletes the backing file.
On x86_64, GDB needs unwind descriptors to unwind the stack, but it can't get them from the deleted file, so you get no stack trace.
You have a few options:
contact the third-party developers and ask them "how can we get stack traces in this situation?"
dump the contents of the region with the GDB dump memory command:
(gdb) dump binary memory /tmp/gly.so 0x9f488000 0x9f48a000
If you are lucky, the resulting binary would actually be an ELF (it doesn't have to be), and may have symbols and unwind descriptors in it. Use readelf --all /tmp/gly.so to look inside.
If it is an ELF file, you can let GDB know that that's what's mapped at 0x9f488000. You'll need to find the address ($tstart below) of .text section in it (should be in readelf output), then:
(gdb) add-symbol-file /tmp/gly.so 0x9f488000+$tstart
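As a rough sketch of the two checks above, the following Python tests whether a dumped region starts with the ELF magic bytes and builds the add-symbol-file command from the mapping start and the .text address. The function names and the 0x400 value for $tstart are illustrative assumptions; take the real value from the readelf output.

```python
ELF_MAGIC = b"\x7fELF"


def looks_like_elf(data):
    """Return True if the buffer begins with the 4-byte ELF magic."""
    return data[:4] == ELF_MAGIC


def add_symbol_file_command(path, map_start, text_start):
    """Build the GDB command for a file mapped at map_start.

    text_start is the address of .text inside the file ($tstart in the
    text above); it has to be read from the readelf output by hand.
    """
    return f"add-symbol-file {path} {map_start + text_start:#x}"


# 0x400 is a made-up .text address used only for illustration.
print(add_symbol_file_command("/tmp/gly.so", 0x9F488000, 0x400))
```

To check a real dump, read the first bytes with open("/tmp/gly.so", "rb").read(4) and pass them to looks_like_elf.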

ptrace: get imagebase of tracee?

I am on Ubuntu 13.10 and have this little stripped and packed ELF file. I need to dump various pieces of information from its process in an automated way, so I hacked together a tiny tracer that traces my progress, similar to strace. Three questions arose:
1) After attaching to my process, how can I get its image base?
2) Where does the process break first? Apparently it is not the EP of the program.
3) Is there any way I can be notified when a .so/.lib file is loaded? GDB can do this somehow, I think.
The first question really is the most important one. Any help is appreciated.
1) /proc/<PID>/maps contains a list of everything the process has mapped and from where, including pages mapped from the executable. By reading the executable's ELF headers you should be able to figure out where .text is.
2) Execution of a dynamically linked binary typically starts in an interpreter. The INTERP program header in an ELF executable (dump it with readelf -e) holds its name. Execution starts at the interpreter's entry point. Typically the interpreter is the runtime linker, ld-<some-variant>.so. It maps in the executable's sections and may also map required shared libraries.
3) GDB has fairly detailed knowledge of how the runtime linker is implemented, so it is able to intercept dynamic object loading by setting breakpoints in the right places. You can do the same. dlopen() seems like a good candidate for an interception point. As I noted in 2), shared objects may have been pre-loaded before the executable gets control.
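For point 1, a minimal Python sketch of the /proc/<PID>/maps approach might look like the following. It takes the lowest start address among mappings backed by the executable's path as the image base. The sample listing, the path /usr/bin/example, and the function names are assumptions for illustration.

```python
# Made-up listing in the /proc/<PID>/maps format:
# start-end perms offset dev inode [pathname]
SAMPLE_MAPS = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/example
00651000-00652000 r--p 00051000 08:02 173521 /usr/bin/example
7f2c4a000000-7f2c4a021000 rw-p 00000000 00:00 0
7f2c4b000000-7f2c4b1c0000 r-xp 00000000 08:02 135522 /lib/x86_64-linux-gnu/libc-2.17.so
"""


def parse_maps(text):
    """Yield (start, end, perms, path) for each line of a maps listing."""
    for line in text.splitlines():
        fields = line.split(maxsplit=5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5] if len(fields) > 5 else ""  # anonymous mappings have no path
        yield start, end, fields[1], path


def image_base(text, exe_path):
    """Return the lowest start address mapped from exe_path, or None."""
    starts = [start for start, _, _, path in parse_maps(text) if path == exe_path]
    return min(starts) if starts else None


print(hex(image_base(SAMPLE_MAPS, "/usr/bin/example")))
```

For a live tracee, the text would come from reading /proc/<PID>/maps, and the executable path from os.readlink on /proc/<PID>/exe.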

Is a core dump executable by itself?

The Wikipedia page on Core dump says
In Unix-like systems, core dumps generally use the standard executable
image-format:
a.out in older versions of Unix,
ELF in modern Linux, System V, Solaris, and BSD systems,
Mach-O in OS X, etc.
Does this mean a core dump is executable by itself? If not, why not?
Edit: Since @WumpusQ.Wumbley mentions a coredump_filter in a comment, perhaps the above question should be: can a core dump be produced such that it is executable by itself?
In older Unix variants the default was to include the text as well as the data in the core dump, but the dump was in a.out format rather than ELF. Today's default behavior (on Linux for sure; I am not 100% sure about BSD variants, Solaris, etc.) is to write the core dump in ELF format without the text sections, but that behavior can be changed.
However, a core dump cannot be executed directly in any case without some help. The reason for that is that there are two things missing from a simple core file. One is the entry point, the other is code to restore the CPU state to the state at or just before the dump occurred (by default also the text sections are missing).
In AIX there used to be a utility called undump, but I have no idea what happened to it. It doesn't exist in any standard Linux distribution I know of. As @WumpusQ mentioned in the comments above, there is also an attempt at a similar project for Linux; however, this project is not complete and doesn't restore the CPU state to the original state. It is, however, still good enough in some specific debugging cases.
It is also worth mentioning that there are other ELF-formatted files, besides core files, that cannot be executed directly, such as object files (compiler output) and .so (shared object) files. Those require a linking stage before being run to resolve external addresses.
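The different kinds of ELF file mentioned here are distinguished by the e_type field of the ELF header, a 2-byte value at offset 16, right after the 16-byte e_ident array. The Python sketch below reads it from the first bytes of a file; the function name elf_type is illustrative.

```python
import struct

# e_type values defined by the ELF specification.
ELF_TYPES = {
    1: "ET_REL",   # relocatable object file (compiler output)
    2: "ET_EXEC",  # executable
    3: "ET_DYN",   # shared object (or position-independent executable)
    4: "ET_CORE",  # core dump
}


def elf_type(header):
    """Return the e_type name from the first 18+ bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] is byte 5: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return ELF_TYPES.get(e_type, f"unknown ({e_type})")
```

For a file on disk, pass the result of open(path, "rb").read(18).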
I emailed this question to the creator of the undump utility for his expertise, and got the following reply:
As mentioned in some of the answers there, it is possible to include
the code sections by setting the coredump_filter, but it's not the
default for Linux (and I'm not entirely sure about BSD variants and
Solaris). If the various code sections are saved in the original
core-dump, there is really nothing missing in order to create the new
executable. It does, however, require some changes in the original
core file (such as including an entry point and pointing that entry
point to code that will restore CPU registers). If the core file is
modified in this way it will become an executable and you'll be able
to run it. Unfortunately, though, some of the states are not going to
be saved so the new executable will not be able to run directly. Open
files, sockets, pipes, etc. are not going to be open and may even point
to other FDs (which could cause all sorts of weird things). However,
it will most probably be enough for most debugging tasks such as running
small functions from gdb (so that you don't get a "not running an
executable" stuff).
As other guys said, I don't think you can execute a core dump file without the original binary.
In case you're interested in debugging the binary (and it has debugging symbols included, in other words it is not stripped), you can run gdb binary core.
Inside gdb you can use the bt (backtrace) command to get the stack trace from when the application crashed.
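If you want the same backtrace without an interactive session, gdb can run the bt command in batch mode. The Python sketch below builds that command line and runs it; the function names and the paths ./app and core are placeholders, and gdb is assumed to be on PATH.

```python
import subprocess


def gdb_backtrace_argv(binary, core):
    """Build the argv for a non-interactive 'gdb binary core' backtrace."""
    # --batch exits after running the commands; -ex supplies one command.
    return ["gdb", "--batch", "-ex", "bt", binary, core]


def backtrace(binary, core):
    """Run gdb and return whatever it printed on standard output."""
    result = subprocess.run(
        gdb_backtrace_argv(binary, core),
        capture_output=True,
        text=True,
        check=False,  # gdb's exit status is not inspected in this sketch
    )
    return result.stdout


print(" ".join(gdb_backtrace_argv("./app", "core")))
```

Calling backtrace("./app", "core") would then return the text gdb prints for bt.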

how to generate a stack trace from a core dump file in C, without invoking an external tool such as gdb

I am looking for a simple way to pull the stack trace out of a Linux core dump file programmatically, without having to invoke gdb. Does anybody have an idea?
To avoid confusion: I am not looking for a way to get my own back trace from inside a process. I am looking for a way to get a backtrace out of a completely independent core dump file I have.
If you really can't invoke gdb, but want a backtrace like the ones it provides, you could just copy the bits of gdb's source that are needed for that into your project. Obviously just invoking gdb will be easier, more maintainable, and less eyebrow-raising, so maybe you should just do that.
