Is a core dump executable by itself? - linux

The Wikipedia page on Core dump says
In Unix-like systems, core dumps generally use the standard executable
image-format:
a.out in older versions of Unix,
ELF in modern Linux, System V, Solaris, and BSD systems,
Mach-O in OS X, etc.
Does this mean a core dump is executable by itself? If not, why not?
Edit: Since #WumpusQ.Wumbley mentions a coredump_filter in a comment, perhaps the above question should be: can a core dump be produced such that it is executable by itself?

In older unix variants it was the default to include the text as well as data in the core dump but it was also given in the a.out format and not ELF. Today's default behavior (in Linux for sure, not 100% sure about BSD variants, Solaris etc.) is to have the core dump in ELF format without the text sections but that behavior can be changed.
However, a core dump cannot be executed directly in any case without some help. The reason for that is that there are two things missing from a simple core file. One is the entry point, the other is code to restore the CPU state to the state at or just before the dump occurred (by default also the text sections are missing).
In AIX there used to be a utility called undump but I have no idea what happened to it. It doesn't exist in any standard Linux distribution I know of. As mentioned above (#WumpusQ) there's also an attempt at a similar project for Linux mentioned in above comments, however this project is not complete and doesn't restore the CPU state to the original state. It is, however, still good enough in some specific debugging cases.
It is also worth mentioning that there exist other ELF formatted files that cannot be executes as well which are not core files. Such as object files (compiler output) and .so (shared object) files. Those require a linking stage before being run to resolve external addresses.

I emailed this question the creator of the undump utility for his expertise, and got the following reply:
As mentioned in some of the answers there, it is possible to include
the code sections by setting the coredump_filter, but it's not the
default for Linux (and I'm not entirely sure about BSD variants and
Solaris). If the various code sections are saved in the original
core-dump, there is really nothing missing in order to create the new
executable. It does, however, require some changes in the original
core file (such as including an entry point and pointing that entry
point to code that will restore CPU registers). If the core file is
modified in this way it will become an executable and you'll be able
to run it. Unfortunately, though, some of the states are not going to
be saved so the new executable will not be able to run directly. Open
files, sockets, pips, etc are not going to be open and may even point
to other FDs (which could cause all sorts of weird things). However,
it will most probably be enough for most debugging tasks such running
small functions from gdb (so that you don't get a "not running an
executable" stuff).

As other guys said, I don't think you can execute a core dump file without the original binary.
In case you're interested to debug the binary (and it has debugging symbols included, in other words it is not stripped) then you can run gdb binary core.
Inside gdb you can use bt command (backtrace) to get the stack trace when the application crashed.

Related

Program that runs on windows and linux

Is it possible to write a program (make executable) that runs on windows and linux without any interpreters?
Will it be able to take input and print output to console?
A program that runs directly on hardware, pure machine code as this should be possible in theory
edit:
Ok, file formats are different, system calls are different
But how hard or is it possible for kernel developers to introduce another executable format called raw for fun and science? Maybe raw program wont be able to report back but it should be able to inflict heavy load on cpu and raise its temperature as evidence of running for example
Is it possible to write a program (make executable) that runs on windows and linux without any interpreters?
in practice, no !
Levine's book Linkers and loaders explain why it is not possible in practice.
On recent Linux, an executable has the elf(5) format.
On Windows, it has some PE format.
The very first bytes of executables are different. And these two OSes have different system calls. The Linux ones are listed in syscalls(2).
And even on Linux, in practice, an executable is usually dynamically linked and depends on shared objects (and they are different from one distribution to the next one, so it is likely that an executable built for Debian/Testing won't run on Redhat). You could use the objdump(1), readelf(1), ldd(1) commands to inspect it, and strace(1) with gdb(1) to observe its runtime behavior.
Portability of software is often achieved by publishing it (in source form) with some open source license. The burden of recompilation is then on the shoulders of users.
In practice, real software (in particular those with a graphical user interface) depends on lots of OS specific and computer specific resources (e.g. fonts, screen size, colors) and user preferences.
A possible approach could be to have a small OS specific software base which generate machine code at runtime, like e.g. SBCL or LuaJit does. You could also consider using asmjit. Another approach is to generate opaque or obfuscated C or C++ code at runtime, compile it (with the system compiler), and load it -at runtime- as a plugin. On Linux, use dlopen(3) with dlsym(3).
Pitrat's book: Artificial Beings, the conscience of a conscious machine describes a software system (some artificial mathematician) which generates all of its C source code (half a million lines). Contact me by email to basile#starynkevitch.net for more.
The Wine emulator allows you to run some (but not all) simple Windows executables on Linux. The WSL layer is rumored to enable you to run some Linux executable on Windows.
PS. Even open source projects like RefPerSys or GCC or Qt may be (and often are) difficult to build.
No, mainly because executable formats are different, but...
With some care, you can use mostly the same code to create different executables, one for Linux and another one for windows. Depending on what you consider an interpreter Java also runs on both Windows and Linux (in a Java Virtual Machine though).
Also, it is possible to create scripts that can be interpreted both by PowerShell and by the Bash shell, such that running one of these scripts could launch a proper application compiled for the OS of the user.
You might require the windows user to run on WSL, which is maybe an ugly workaround but allows you to have the same executable for both Windows and Linux users.

Okay to call a file core.foo?

On Unix-like operating systems, one should never call a file core because it might be overwritten by a core dump, so that name is de facto reserved for use by the operating system.
But what about with an extension, like core.c? It seems to me this should be perfectly okay; core and core.c are two distinct names.
Is the above correct, and names like core.foo are okay? Or am I missing anything?
On Unix-like operating systems,
Ie. on a system conforming to the single UNIX Specification (or just to POSIX). Or on a system belonging to the unix family.
one should never call a file core because it might be overwritten by a core dump, so that name is de facto reserved for use by the operating system.
No. I do not think there is a specification that specifies that core dumps are going into a file named "core". It's all "implementation defined", from posix:
3.117 Core File
A file of unspecified format that may be generated when a process terminates abnormally.
If coredump is created, where is it created, how and what the contents are, it's all up to implementation. [Freebsd creates a file named executable_name.core for ages. So not "on unix-like operating systems". Your sentence could be made valid by changing the beginning to:
On a linux system with kernel version lower then 2.6 or 2.4.21 one should never call a file "core", because...
Kernel 2.6 and 2.4.21 is more then 15 years old. Newer kernels have /proc/sys/kernel/core_pattern that allow specifying the filename and location and even a process to run on a coredump. So if you are working with such archaic kernel version lower then 2.6 or 2.4.21 (or the content of /proc/sys/kernel/core_pattern is naively set to core), then yes, you should be careful when creating a file named "core".
In today's world, that behavior doesn't matter at all, on most linux distributions systemd-coredump takes care of that and creates coredumps in /var/lib/systemd/coredump.
But what about with an extension, like core.c?
It's ok - core and core.c are different filenames. Dot is not a special character, it's just a character like anything else, core.c differs from core as much as core12 or coreAB does...
Is the above correct,
Yes.
and names like core.foo are okay?
Are okay.
Or am I missing anything?
No idea, but if you are, I surely hope you will find it.

How to automatically load a given so into any newly-started process under Linux?

Under Windows, there are several ways to automatically load a given dll into any newly-started process.
Is it possible to do the same thing under Linux?
Is it possible to do the same thing under Linux?
There is /etc/ld.so.preload, but that only works for dynamically-linked program binaries. Documentation here.
You also need to be extremely careful: if you specify something that can't be preloaded, you may make your system unbootable, or you may no longer be able to log in.

Accessing /proc

I'm currently developing an application which needs a lot of system and process information, some of which is only available through /proc, and I have some general questions about accessing the structures.
The application will be run on Linux (kernel >= 2.6), not on any other Unix-flavored OS. It should have access to any data in /proc, I can't say what is necessary now as the specifications are not clear yet, but the whole /proc directory is relevant to the application.
First of all: Is there a good documentation which covers all the features added / removed from kernel version to kernel version? One thing I'm curious about in particular is the format of the individual files. Can I take that for granted? Does it change among kernel versions?
Hooking up the parsing process based on the kernel wouldn't be a problem at all, it's just that I couldn't find any good docs on what has changed from version to version which could help me catching parsing errors in beforehand.
In addition: Is there a definite list of features that can be activated / deactivated by kernel options (except of course the /proc-feature itself)? I'm looking for a list of files / directories that only exist with the appropriate options being set in the kernel.
As an example of what I'm thinking of, this is a link to the proc manpage (http://linux.die.net/man/5/proc) which includes a lot of good information, e.g. some options include the earliest kernel version they were available at, some include whether a module is necessary to be loaded. This does not describe the output format of all information though, which is something I need if I want to parse it (e.g. if it is consistent throughout all kernel versions or changed at some point).
The second thing I'm wondering about is what happens if the process queried dies while being queried. What is my time interval? For example if I'm going to fetch a list of processes reading all the structures, and parse them one after another, what happens if my process x dies before I get to read it? Even if I check if the directory exists, it could still be gone one application call later.
Last but not least: Is there any major distribution out there that is not mounting proc?
From what I understand, a lot of common tools are based on the /proc interface such as lsmod or free, so I'm guessing that I can expect /proc to exist almost always.
The /proc interfaces are pretty stable (unlike the /sys interfaces), even if nothing is guaranteed. Almost all changes are backwards compatible, at least if they've been around for a few versions. You should
stick to the documented interfaces to be safe. If a file exists, its format may be extended in later versions, but normally in a backwards compatible way, e.g. adding columns to a table. The parts that are most at risk of disappearing are parts concerning hardware susbystems such as ACPI or SCSI, which are migrating to /sys (with a long transition period when both exist).
Most of the information is architecture-independent, except for hardware information (e.g. /proc/cpuinfo has very different fields on different architectures).
The main documentation is Documentation/filesystems/proc.txt in the kernel source. Consider proc(5) to be the overview and proc.txt to be the fine details. The kernel documentation is often incomplete, so don't be surprised if you need to resort to reading the source sometimes.
Most optional parts of /proc are activated by default if the driver whose data it exposes is included in the kernel. The exceptions are mostly related to hardware features that rarely need to be accessed from outside the kernel; if you need to access these features, you're probably already expecting to need to dig deeper. Look through Kconfig files in the kernel source for detailed information.
Process data (or hardware data related to removable hardware or provided by unloadable modules) can disappear under your nose. Most files under /proc can be read atomically, with a single read call with a reasonably-sized buffer; if you perform multiple read calls in sequence, drivers are supposed to guarantee that you get well-formed data. There is no way to guarantee atomicity between reads of separate files; if you're reading information about a process, this process can die at any time, and in principle could even be replaced by another process with the same PID before you're finished.
As it says in the description of /proc, “everyone should say Y here”. All desktop/server Linux systems and most embedded Linux systems must have /proc; a lot of things, including ps and other process management commands, many filesystem and device-related tools, and module loading, require it. The only systems that might be able to dispense with /proc are very small single-purpose embedded systems that support a single hardware configuration and run a fixed set of programs. You can count on its being here.

How to increase probability of Linux core dumps matching symbols?

I have a very complex cross-platform application. Recently my team and I have been running stress tests and have encountered several crashes (and core dumps accompanying them). Some of these core dumps are very precise, and show me the exact location where the crash occurred with around 10 or more stack frames. Others sometimes have just one stack frame with ?? being the only symbol!
What I'd like to know is:
Is there a way to increase the probability of core dumps pointing in the right direction?
Why isn't the number of stack frames reported consistent?
Any best practice advise for managing core dumps.
Here's how I compile the binaries (in release mode):
Compiler and platform: g++ with glibc-2.3.2-95.50 on CentOS 3.6 x86_64 -- This helps me maintain compatibility with older versions of Linux.
All files are compiled with the -g flag.
Debug symbols are stripped from the final binary and saved in a separate file.
When I have a core dump, I use GDB with the executable which created the core, and the symbols file. GDB never complains that there's a mismatch between the core/binary/symbols.
Yet I sometimes get core dumps with no symbols at all! It's understandable that I'm linking against non-debug version of libstdc++ and libgcc, but it would be nice if at least the stack trace shows me where in my code did the faulty instruction call originate (although it may ultimately end in ??).
Others sometimes have just one stack frame with "??" being the only symbol!
There can be many reasons for that, among others:
the stack frame was trashed (overwritten)
EBP/RBP (on x86/x64) is currently not holding any meaningful value — this can happen e.g. in units compiled with -fomit-frame-pointer or asm units that do so
Note that the second point may occur simply by, for example, glibc being compiled in such a way. Having the debug info for such system libraries installed could mitigate this (something like what the glibc-debug{info,source} packages are on openSUSE).
gdb has more control over the program than glibc, so glibc's backtrace call would naturally be unable to print a backtrace if gdb cannot do so either.
But shipping the source would be much easier :-)
As an alternative, on a glibc system, you could use the backtrace function call (or backtrace_symbols or backtrace_symbols_fd) and filter out the results yourself, so only the symbols belonging to your own code are displayed. It's a bit more work, but then, you can really tailor it to your needs.
Have you tried installing debugging symbols of the various libraries that you are using? For example, my distribution (Ubuntu) provides libc6-dbg, libstdc++6-4.5-dbg, libgcc1-dbg etc.
If you're building with optimisation enabled (eg. -O2), then the compiler can blur the boundary between stack frames, for example by inlining. I'm not sure that this would cause backtraces with just one stack frame, but in general the rule is to expect great debugging difficulty since the code you are looking it in the core dump has been modified and so does not necessarily correspond to your source.

Resources