Why does a shared object fail if it has extra symbols compared to the original - shared-libraries

I have a stripped ld.so that I want to replace with the unstripped version (so that valgrind works). I have ensured that I have the same version of glib and the cross compiler.
I have compiled the shared object, calling 'file' on it shows that it is compiled correctly (the only difference with the original being the 'unstripped' and being about 15% bigger). Unfortunately, it then causes a kernel panic (unable to init) on start up. Stripping the newly compiled .so, readelf-ing it and diff-ing it with the original, shows that there were extra symbols in the new version of the .so . All of the old symbols were still present, so what I don't understand is why the kernel panics with those extra symbols there.
I would expect the extra symbols to have no affect on the kernel start up, as they should never be called, so why do I get a kernel panic?
NB: To be clear - I will still need to investigate why there are extra symbols, but my question is about why these unused symbols cause problems.

The kernel (assuming Linux) doesn't depend on or use ld.so in any way, shape or form. The reason it panics is most likely that it can't exec any of the user-level programs (such as /bin/init and /bin/sh), which do use ld.so.
As to why your init doesn't like your new ld.so, it's hard to tell. One common mistake is to try to replace ld.so with the contents of /usr/lib/debug/ld-X.Y.so. While that file looks like it's not much different from the original /lib/ld-X.Y.so, it is in fact very different, and can't be used to replace the original, only to debug the original (/usr/lib/debug/ld-X.Y.so usually contains only debug sections, but none of code and data sections of /lib/ld-X.Y.so, and so attempts to run it usually cause immediate SIGSEGV).
Perhaps you can set up a chroot, mimicking your embedded environment, and run /bin/ls in it? The error (or a core dump) this will produce will likely tell you what's wrong with your ld.so.

Related

How addr2line can locate the source file and the line of code?

addr2line translates addresses into file names and line numbers. I am still beginner in debugging, and have some questions about addr2line.
If am debugging a certain .so (binary) file, how the tool can locate
its source code file (from where can get it!), what if the source doesn't exist?
What is the relation between the address in a binary and the line
number in its source, so addr2line can do this kind of mapping?
In general, addr2line works best on ELF executables or shared libraries with debug information. That debug information is emitted by the compiler when you pass -g (or -g2, etc...) to GCC. It notably provides a mapping between source code location (name of source file, line number, column number) and functions, variable names, call stack frame organization, etc etc... The debug information is today in DWARF format (and is also processed by the gdb debugger, the libbacktrace library, etc etc...). Notice that the debug information contains source file paths (not the source file itself).
In practice, you can (and often should) pass the -g (or -g2) debugging option to GCC even with optimization flags like -O2. In that case, the debug information is slightly less precise but still practically usable. In some cases, stack frames may disappear (inlined function calls, tail call optimizations, ....).
You could use the strip(1) utility to remove debug information (and other symbol tables, etc) from some ELF executable.

ptrace: get imagebase of tracee?

I am on ubuntu 13.10 and have this little stripped+packed elf file. I need to dump various pieces of information from its process in an automated way, so i hacked together a tiny tracer that traces my progress, similar to strace. Three questions arose:
1) after attaching to my process, how can i get it's imagebase?
2) where does the process break first? Apparently it is not the EP of the program.
3) any way i can be notified when a .so/.lib file is loaded? GDB can do this somehow, i think.
The first question really is the most important one. Any help is appreciated.
1) /proc/<PID>/maps contains list of everything the process mapped and from where, including pages mapped from an executable. By reading executable ELF headers you should be able to figure out where .text is.
2) Execution of dynamically linked binary typically starts with an interpreter. INTERP program header in an ELF executable (dump with readelf -e) will have its name. It's interpreter's entry point where execution starts. Typically it's a runtime linker ld-<some-variant>.so. It maps in executable's sections and may also map required shared libraries.
3) GDB has fairly detailed knowledge how runtime linker is implemented so it's able to intercept dynamic object loading by setting breakpoints in the right places. You can do the same. dlopen() seems like a good candidate for an interception point. As I noted in #2, shared objects may have been pre-loaded before the executable gets control.

Is a core dump executable by itself?

The Wikipedia page on Core dump says
In Unix-like systems, core dumps generally use the standard executable
image-format:
a.out in older versions of Unix,
ELF in modern Linux, System V, Solaris, and BSD systems,
Mach-O in OS X, etc.
Does this mean a core dump is executable by itself? If not, why not?
Edit: Since #WumpusQ.Wumbley mentions a coredump_filter in a comment, perhaps the above question should be: can a core dump be produced such that it is executable by itself?
In older unix variants it was the default to include the text as well as data in the core dump but it was also given in the a.out format and not ELF. Today's default behavior (in Linux for sure, not 100% sure about BSD variants, Solaris etc.) is to have the core dump in ELF format without the text sections but that behavior can be changed.
However, a core dump cannot be executed directly in any case without some help. The reason for that is that there are two things missing from a simple core file. One is the entry point, the other is code to restore the CPU state to the state at or just before the dump occurred (by default also the text sections are missing).
In AIX there used to be a utility called undump but I have no idea what happened to it. It doesn't exist in any standard Linux distribution I know of. As mentioned above (#WumpusQ) there's also an attempt at a similar project for Linux mentioned in above comments, however this project is not complete and doesn't restore the CPU state to the original state. It is, however, still good enough in some specific debugging cases.
It is also worth mentioning that there exist other ELF formatted files that cannot be executes as well which are not core files. Such as object files (compiler output) and .so (shared object) files. Those require a linking stage before being run to resolve external addresses.
I emailed this question the creator of the undump utility for his expertise, and got the following reply:
As mentioned in some of the answers there, it is possible to include
the code sections by setting the coredump_filter, but it's not the
default for Linux (and I'm not entirely sure about BSD variants and
Solaris). If the various code sections are saved in the original
core-dump, there is really nothing missing in order to create the new
executable. It does, however, require some changes in the original
core file (such as including an entry point and pointing that entry
point to code that will restore CPU registers). If the core file is
modified in this way it will become an executable and you'll be able
to run it. Unfortunately, though, some of the states are not going to
be saved so the new executable will not be able to run directly. Open
files, sockets, pips, etc are not going to be open and may even point
to other FDs (which could cause all sorts of weird things). However,
it will most probably be enough for most debugging tasks such running
small functions from gdb (so that you don't get a "not running an
executable" stuff).
As other guys said, I don't think you can execute a core dump file without the original binary.
In case you're interested to debug the binary (and it has debugging symbols included, in other words it is not stripped) then you can run gdb binary core.
Inside gdb you can use bt command (backtrace) to get the stack trace when the application crashed.

How to build the elf interpreter (ld-linux.so.2/ld-2.17.so) as static library?

I apologize if my question is not precise because I don't have a lot
of Linux related experience. I'm currently building a Linux from
scratch (mostly following the guide at linuxfromscratch.org version
7.3). I ran into the following problem: when I build an executable it
gets a hardcoded path to something called ELF interpreter.
readelf -l program
shows something like
[Requesting program interpreter: /lib/ld-linux.so.2]
I traced this library ld-linux-so.2 to be part of glibc. I am not very
happy with this behaviour because it makes the binary very unportable
- if I change the location of /lib/ld-linux.so.2 the executable no
longer works and the only "fix" I found is to use the patchelf utility
from NixOS to change the hardcoded path to another hardcoded path. For
this reason I would like to link against a static version of the ld
library but such is not produced. And so this is my question, could
you please explain how could I build glibc so that it will produce a
static version of ld-linux.so.2 which I could later link to my
executables. I don't fully understand what this ld library does, but I
assume this is the part that loads other dynamic libraries (or at
least glibc.so). I would like to link my executables dynamically, but
I would like the dynamic linker itself to be statically built into
them, so they would not depend on hardcoded paths. Or alternatively I
would like to be able to set the path to the interpreter with
environment variable similar to LD_LIBRARY_PATH, maybe
LD_INTERPRETER_PATH. The goal is to be able to produce portable
binaries, that would run on any platform with the same ABI no matter
what the directory structure is.
Some background that may be relevant: I'm using Slackware 14 x86 to
build i686 compiler toolchain, so overall it is all x86 host and
target. I am using glibc 2.17 and gcc 4.7.x.
I would like to be able to set the path to the interpreter with environment variable similar to LD_LIBRARY_PATH, maybe LD_INTERPRETER_PATH.
This is simply not possible. Read carefully (and several times) the execve(2), elf(5) & ld.so(8) man pages and the Linux ABI & ELF specifications. And also the kernel code doing execve.
The ELF interpreter is responsible for dynamic linking. It has to be a file (technically a statically linked ELF shared library) at some fixed location in the file hierarchy (often /lib/ld.so.2 or /lib/ld-linux.so.2 or /lib64/ld-linux-x86-64.so.2)
The old a.out format from the 1990s had a builtin dynamic linker, partly implemented in old Linux 1.x kernel. It was much less flexible, and much less powerful.
The kernel enables, by such (in principle) arbitrary dynamic linker path, to have various dynamic linkers. But most systems have only one. This is a good way to parameterize the dynamic linker. If you want to try another one, install it in the file system and generate ELF executables mentioning that path.
With great pain and effort, you might make your own ld.so-like dynamic linker implementing your LD_INTERPRETER_PATH wish, but that linker still has to be an ELF shared library sitting at some fixed location in the file tree.
If you want a system not needing any files (at some predefined, and wired locations, like /lib/ld.so, /dev/null, /sbin/init ...), you'll need to build all its executable binaries statically. You may want (but current Linux distributions usually don't do that) to have a few statically linked executables (like /sbin/init, /bin/sash...) that will enable you to repair a system broken to the point of not having any dynamic linker.
BTW, the /sbin/init -or /bin/sh - path is wired inside the kernel itself. You may pass some argument to the kernel at boot load time -e.g. with GRUB- to overwrite the default. So even the kernel wants some files to be here!
As I commented, you might look into MUSL-Libc for an alternative Libc implementation (providing its own dynamic linker). Read also about VDSO and ASLR and initrd.
In practice, accept the fact that modern Linuxes and Unixes are expecting some non-empty file system ... Notice that dynamic linking and shared libraries are a huge progress (it was much more painful in the 1990s Linux kernels and distributions).
Alternatively, define your own binary format, then make a kernel module or a binfmt_misc entry to handle it.
BTW, most (or all) of Linux is free software, so you can improve it (but this will take months -or many years- of work to you). Please share your improvements by publishing them.
Read also Drepper's Hwo to Write Shared Libraries paper; and this question.
I ran into the same issue. In my case I want to bundle my application with a different GLIBC than comes system installed. Since ld-linux.so must match the GLIBC version I can't simply deploy my application with the according GLIBC. The problem is that I can't run my application on older installations that don't have the required GLIBC version.
The path to the loader interpreter can be modified with --dynamic-linker=/path/to/interp. However, this needs to be set at compile time and therefore would require my application to be installed in that location (or at least I would need to deploy the ld-linux.so that goes with my GLIBC in that location which goes against a simple xcopy deployment.
So what's needed is an $ORIGIN option equivalent to what the -rpath option can handle. That would allow for a fully dynamic deployment.
Given the lack of a dynamic interpreter path (at runtime) leaves two options:
a) Use patchelf to modify the path before the executable gets launched.
b) Invoke the ld-linux.so directly with the executable as an argument.
Both options are not as 'integrated' as a compiled $ORIGIN path in the executable itself.

Which libraries appear in /proc/$PID/pmaps?

On Linux you can inspect /proc/$PID/pmaps to see the libraries loaded by a particular program, and a program can open /proc/self/pmaps to examine the libraries it itself has loaded.
I know pmaps will only contain dynamic libraries, and obviously the kernel can't predict which libraries we might dlopen at a later point, so I expect those aren't included in /proc/self/maps. But I'm unsure of a few other other scenarios:
Are libraries that have been linked at build time but we haven't called any function in yet included? My understanding is the Linux delays linking symbols until the first time they are used, so I'm not sure if they'll show up.
Does pmaps contain all the libraries used recursively? E.g. if I look at each library in pmaps and run ldd on it, and then run ldd on those, ad nauseum, I shouldn't find any new libraries that weren't in the original pmaps? I tried this on a couple binaries and it appears to be so but maybe I'm getting lucky.
Are libraries that have been linked at build time but we haven't called any function in yet included?
Yes: the runtime loader will mmap every library that your executable directly depends on, before your program starts running.
You can find the list of such libraries by running
readelf -d a.out | grep NEEDED
Does pmaps contain all the libraries used recursively?
Yes: if a library that you directly depend on itself depends on some other library, the runtime loader will mmap the recursive dependencies as well.
My understanding is the Linux delays linking symbols until the first time they are used
That is mosty correct for function symbols, but false for data symbols, which can't be resolved lazily.
Also, whether the symbols are resolved lazily or not depends on LD_BIND_NOW environment variable, and on an equivalent setting in the executable dynamic section, controlled by -znow linker flag.
None of that changes the mmap pciture though; if you have a DT_NEEDED entry for foo.so in your dynamic section, then foo.so will be mmaped (and will show in /proc/self/*map*) independent of lazy or non-lazy resolution.
/proc/$pid/maps is not only going to list libraries that are loaded, but also ALL other mapped memory segments.
Read this thread and the article in there:
Understanding Linux /proc/id/maps

Resources