I am on ubuntu 13.10 and have this little stripped+packed elf file. I need to dump various pieces of information from its process in an automated way, so i hacked together a tiny tracer that traces my progress, similar to strace. Three questions arose:
1) after attaching to my process, how can i get it's imagebase?
2) where does the process break first? Apparently it is not the EP of the program.
3) any way i can be notified when a .so/.lib file is loaded? GDB can do this somehow, i think.
The first question really is the most important one. Any help is appreciated.
1) /proc/<PID>/maps contains list of everything the process mapped and from where, including pages mapped from an executable. By reading executable ELF headers you should be able to figure out where .text is.
2) Execution of dynamically linked binary typically starts with an interpreter. INTERP program header in an ELF executable (dump with readelf -e) will have its name. It's interpreter's entry point where execution starts. Typically it's a runtime linker ld-<some-variant>.so. It maps in executable's sections and may also map required shared libraries.
3) GDB has fairly detailed knowledge how runtime linker is implemented so it's able to intercept dynamic object loading by setting breakpoints in the right places. You can do the same. dlopen() seems like a good candidate for an interception point. As I noted in #2, shared objects may have been pre-loaded before the executable gets control.
Related
addr2line translates addresses into file names and line numbers. I am still beginner in debugging, and have some questions about addr2line.
If am debugging a certain .so (binary) file, how the tool can locate
its source code file (from where can get it!), what if the source doesn't exist?
What is the relation between the address in a binary and the line
number in its source, so addr2line can do this kind of mapping?
In general, addr2line works best on ELF executables or shared libraries with debug information. That debug information is emitted by the compiler when you pass -g (or -g2, etc...) to GCC. It notably provides a mapping between source code location (name of source file, line number, column number) and functions, variable names, call stack frame organization, etc etc... The debug information is today in DWARF format (and is also processed by the gdb debugger, the libbacktrace library, etc etc...). Notice that the debug information contains source file paths (not the source file itself).
In practice, you can (and often should) pass the -g (or -g2) debugging option to GCC even with optimization flags like -O2. In that case, the debug information is slightly less precise but still practically usable. In some cases, stack frames may disappear (inlined function calls, tail call optimizations, ....).
You could use the strip(1) utility to remove debug information (and other symbol tables, etc) from some ELF executable.
The Wikipedia page on Core dump says
In Unix-like systems, core dumps generally use the standard executable
image-format:
a.out in older versions of Unix,
ELF in modern Linux, System V, Solaris, and BSD systems,
Mach-O in OS X, etc.
Does this mean a core dump is executable by itself? If not, why not?
Edit: Since #WumpusQ.Wumbley mentions a coredump_filter in a comment, perhaps the above question should be: can a core dump be produced such that it is executable by itself?
In older unix variants it was the default to include the text as well as data in the core dump but it was also given in the a.out format and not ELF. Today's default behavior (in Linux for sure, not 100% sure about BSD variants, Solaris etc.) is to have the core dump in ELF format without the text sections but that behavior can be changed.
However, a core dump cannot be executed directly in any case without some help. The reason for that is that there are two things missing from a simple core file. One is the entry point, the other is code to restore the CPU state to the state at or just before the dump occurred (by default also the text sections are missing).
In AIX there used to be a utility called undump but I have no idea what happened to it. It doesn't exist in any standard Linux distribution I know of. As mentioned above (#WumpusQ) there's also an attempt at a similar project for Linux mentioned in above comments, however this project is not complete and doesn't restore the CPU state to the original state. It is, however, still good enough in some specific debugging cases.
It is also worth mentioning that there exist other ELF formatted files that cannot be executes as well which are not core files. Such as object files (compiler output) and .so (shared object) files. Those require a linking stage before being run to resolve external addresses.
I emailed this question the creator of the undump utility for his expertise, and got the following reply:
As mentioned in some of the answers there, it is possible to include
the code sections by setting the coredump_filter, but it's not the
default for Linux (and I'm not entirely sure about BSD variants and
Solaris). If the various code sections are saved in the original
core-dump, there is really nothing missing in order to create the new
executable. It does, however, require some changes in the original
core file (such as including an entry point and pointing that entry
point to code that will restore CPU registers). If the core file is
modified in this way it will become an executable and you'll be able
to run it. Unfortunately, though, some of the states are not going to
be saved so the new executable will not be able to run directly. Open
files, sockets, pips, etc are not going to be open and may even point
to other FDs (which could cause all sorts of weird things). However,
it will most probably be enough for most debugging tasks such running
small functions from gdb (so that you don't get a "not running an
executable" stuff).
As other guys said, I don't think you can execute a core dump file without the original binary.
In case you're interested to debug the binary (and it has debugging symbols included, in other words it is not stripped) then you can run gdb binary core.
Inside gdb you can use bt command (backtrace) to get the stack trace when the application crashed.
On Linux you can inspect /proc/$PID/pmaps to see the libraries loaded by a particular program, and a program can open /proc/self/pmaps to examine the libraries it itself has loaded.
I know pmaps will only contain dynamic libraries, and obviously the kernel can't predict which libraries we might dlopen at a later point, so I expect those aren't included in /proc/self/maps. But I'm unsure of a few other other scenarios:
Are libraries that have been linked at build time but we haven't called any function in yet included? My understanding is the Linux delays linking symbols until the first time they are used, so I'm not sure if they'll show up.
Does pmaps contain all the libraries used recursively? E.g. if I look at each library in pmaps and run ldd on it, and then run ldd on those, ad nauseum, I shouldn't find any new libraries that weren't in the original pmaps? I tried this on a couple binaries and it appears to be so but maybe I'm getting lucky.
Are libraries that have been linked at build time but we haven't called any function in yet included?
Yes: the runtime loader will mmap every library that your executable directly depends on, before your program starts running.
You can find the list of such libraries by running
readelf -d a.out | grep NEEDED
Does pmaps contain all the libraries used recursively?
Yes: if a library that you directly depend on itself depends on some other library, the runtime loader will mmap the recursive dependencies as well.
My understanding is the Linux delays linking symbols until the first time they are used
That is mosty correct for function symbols, but false for data symbols, which can't be resolved lazily.
Also, whether the symbols are resolved lazily or not depends on LD_BIND_NOW environment variable, and on an equivalent setting in the executable dynamic section, controlled by -znow linker flag.
None of that changes the mmap pciture though; if you have a DT_NEEDED entry for foo.so in your dynamic section, then foo.so will be mmaped (and will show in /proc/self/*map*) independent of lazy or non-lazy resolution.
/proc/$pid/maps is not only going to list libraries that are loaded, but also ALL other mapped memory segments.
Read this thread and the article in there:
Understanding Linux /proc/id/maps
I have a stripped ld.so that I want to replace with the unstripped version (so that valgrind works). I have ensured that I have the same version of glib and the cross compiler.
I have compiled the shared object, calling 'file' on it shows that it is compiled correctly (the only difference with the original being the 'unstripped' and being about 15% bigger). Unfortunately, it then causes a kernel panic (unable to init) on start up. Stripping the newly compiled .so, readelf-ing it and diff-ing it with the original, shows that there were extra symbols in the new version of the .so . All of the old symbols were still present, so what I don't understand is why the kernel panics with those extra symbols there.
I would expect the extra symbols to have no affect on the kernel start up, as they should never be called, so why do I get a kernel panic?
NB: To be clear - I will still need to investigate why there are extra symbols, but my question is about why these unused symbols cause problems.
The kernel (assuming Linux) doesn't depend on or use ld.so in any way, shape or form. The reason it panics is most likely that it can't exec any of the user-level programs (such as /bin/init and /bin/sh), which do use ld.so.
As to why your init doesn't like your new ld.so, it's hard to tell. One common mistake is to try to replace ld.so with the contents of /usr/lib/debug/ld-X.Y.so. While that file looks like it's not much different from the original /lib/ld-X.Y.so, it is in fact very different, and can't be used to replace the original, only to debug the original (/usr/lib/debug/ld-X.Y.so usually contains only debug sections, but none of code and data sections of /lib/ld-X.Y.so, and so attempts to run it usually cause immediate SIGSEGV).
Perhaps you can set up a chroot, mimicking your embedded environment, and run /bin/ls in it? The error (or a core dump) this will produce will likely tell you what's wrong with your ld.so.
Until now I've always used resources under MSVC++ to get access to raw data from inside of my programs and I've never worked with a linker directly, but now I'm under Linux and I'm using a cross-compiler to produce elf files. A friend and I are working on a toy OS.
One thing we need to get accomplished at some point is for a rather large piece of arbitrary raw data to be linked into the executable. We want the data to be located near the end of the executable and need to be able to get a pointer to that raw data as well. It's probably worth noting that GRUB is loading the kernel into memory at boot time.
One of our previous ideas was to just write a program to convert the data into a C source file where the data was represented as an array of bytes, but we figure that's a little bit messy and we'd rather have it linked in directly.
Any insights? I don't need the gruesome details just a broad overview of what needs to be done. I figure we probably have to make some changes to our linker script.
Take a look at calling objdump --add-section after you complete the link to add the arbitrary data to the ELF file.
Alternatively, if you are writing a kernel, you can do what Linux does to load an initrd and just have GRUB load your kernel and then load the data seperately to a known memory location and access it that way.