What is a memory image in *nix systems? - linux

In the book Advanced Programming in the Unix Environment 3rd Edition, Chapter 10 -- Signals, Page 315, when talking about the actions taken by processes that receive a signal, the author says
When the default action is labeled "terminate+core", it means that a memory image of the process is left in the file named core of the current working directory of the process.
What is a memory image? When is this created, what's the content of it, and what is it used for?

A memory image is simply a copy of the process's virtual memory, saved in a file. It's used when debugging the program, as you can examine the values of the program's variables and determine which functions were being called at the time of the failure.
As the documentation you quoted says, this file is created when the process is terminated due to a signal that has the "terminate+core" default action.

A memory image is often called a core image. See core(5) and the core dump wikipage.
Roughly speaking, a core image describes the process's virtual address space (and its content) at the time of the crash, including the call stack of each active thread and the writable data segments for global data and heaps, but often excluding the text (code) segments, which are read-only and already available in the executable ELF file or in shared libraries. It also contains the register state of each thread.
The name core is understandable only to old guys like me (who saw computers built in the 1960s and 1970s, like the IBM/360, the PDP-10 and the early PDP-11, the latter two used for developing the primordial Unix), since long ago (1950s-1970s) random access memory was made of magnetic core memory.
If you have compiled all your source code with debug information (e.g. using gcc -g -Wall), you can do some post-mortem debugging (after yourprogram crashed and dumped a core file!) with gdb:
gdb yourprogram core
and the first gdb command you'll try is probably bt to get the backtrace.
Don't forget to enable core dumps: this is governed by the setrlimit(2) syscall, and is generally done in your shell with e.g. ulimit -c unlimited
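If you would rather enable core dumps from inside the program itself, a minimal sketch using setrlimit(2) could look like this (the helper name enable_core_dumps is just illustrative; error handling is omitted):

    #include <sys/resource.h>

    /* Raise the soft core-size limit up to the current hard limit,
       similar in spirit to running `ulimit -c unlimited` in the shell
       (an unprivileged process cannot exceed its hard limit). */
    static void enable_core_dumps(void)
    {
        struct rlimit rl;
        if (getrlimit(RLIMIT_CORE, &rl) == 0) {
            rl.rlim_cur = rl.rlim_max;
            setrlimit(RLIMIT_CORE, &rl);
        }
    }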
Several signals can dump core; see signal(7). A common cause is a segmentation violation, e.g. dereferencing a NULL or bad pointer, which raises a SIGSEGV signal that (often) dumps a core file in the current directory.
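As a tiny illustration (a hypothetical crash.c, not taken from the book), the following program, built with gcc -g crash.c and run with core dumps enabled, receives SIGSEGV and leaves a core file you can then inspect with gdb ./a.out core and bt:

    #include <stdio.h>

    int main(void)
    {
        int *p = NULL;        /* a bad (NULL) pointer                          */
        printf("%d\n", *p);   /* dereferencing it raises SIGSEGV, dumping core */
        return 0;
    }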
See also gcore(1).

Related

Core files generated by linux kernel modules

I am trying to load a kernel module (out-of-tree) and dmesg shows a panic. The kernel is still up though. I guess the module panic'd.
Where to find the core file? I want to use gdb and see what's the problem.
Where to find the core file?
Core files are strictly a user-space concept.
I want to use gdb and see what's the problem.
You may be looking for KGDB and/or Kdump/Kexec.
Normally, whenever a coredump is generated, the shell prints "core dumped". This is one easy, high-level way to confirm that a coredump was generated; however, this message alone does not guarantee that the coredump file is available. The location where the coredump is written is specified through core_pattern, passed to the kernel via sysctl, so you need to check the contents of core_pattern on your system. Also note that on Ubuntu the coredump file size limit appears to be zero by default, which prevents coredumps from being generated. So you might need to check the core file size ulimit and change it with 'ulimit -c unlimited' if it is zero. The manpage http://man7.org/linux/man-pages/man5/core.5.html explains the various reasons why a coredump may not be generated.
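To make those two checks concrete, here is a minimal C sketch (assuming Linux with /proc mounted; not part of the original answer) that prints the current core_pattern and the RLIMIT_CORE soft/hard values of the calling process:

    #include <stdio.h>
    #include <string.h>
    #include <sys/resource.h>

    int main(void)
    {
        char pattern[256] = "";
        FILE *f = fopen("/proc/sys/kernel/core_pattern", "r");
        if (f) {
            if (fgets(pattern, sizeof pattern, f))
                pattern[strcspn(pattern, "\n")] = '\0';  /* drop trailing newline */
            fclose(f);
        }
        printf("core_pattern: %s\n", pattern);

        struct rlimit rl;                                /* core file size limits */
        if (getrlimit(RLIMIT_CORE, &rl) == 0)
            printf("RLIMIT_CORE: soft=%llu hard=%llu\n",
                   (unsigned long long)rl.rlim_cur,
                   (unsigned long long)rl.rlim_max);
        return 0;
    }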
However, from your explanation, it appears that you are facing a kernel oops, as the kernel is still up (in an unstable state) even though a particular module panicked or was killed. In such cases the kernel prints an oops message. Refer to https://www.kernel.org/doc/Documentation/oops-tracing.txt, which has information regarding kernel oops messages.
Abstract from the link: Normally the Oops text is read from the kernel buffers by klogd and handed to syslogd which writes it to a syslog file, typically /var/log/messages (depends on /etc/syslog.conf). Sometimes klogd dies, in which case you can run dmesg > file to read the data from the kernel buffers and save it. Or you can cat /proc/kmsg > file, however you have to break in to stop the transfer, kmsg is a "never ending file".
printk is used for generating the oops messages. printk tags each message with a severity by means of different loglevels/priorities, which allows messages to be classified according to their severity. (The different priorities are defined in linux/kernel.h or linux/kern_levels.h, as macros like KERN_EMERG, KERN_ALERT, KERN_CRIT, etc.) So you may need to check the default logging levels of the system with cat /proc/sys/kernel/printk and change them as per your requirement. Also check whether the logging daemons are up, and in case you want to debug the kernel, ensure that it is compiled with CONFIG_DEBUG_INFO.
A method for using GDB to find the location where the kernel panicked or oopsed on Ubuntu is described at https://wiki.ubuntu.com/Kernel/KernelDebuggingTricks, which can be one of the methods you use for debugging a kernel oops.
There won't be a core file.
You should follow the stack trace in the kernel messages; type dmesg to see it.

Linux kernel loading process

I'm reading about the Linux kernel loading process (just to understand the whole sequence) and I have several doubts, especially about the transitions of control between:
The boot-loader and the kernel
The kernel and the init process
For example, on Wikipedia I found the following:
The kernel as loaded is typically an image file, compressed into either zImage or bzImage formats with zlib. A routine at the head of it does a minimal amount of hardware setup, decompresses the image fully into high memory, and takes note of any RAM disk if configured.[3] It then executes kernel startup via ./arch/i386/boot/head and the startup_32 ()
Here I have several questions:
What is this routine?
In which part of memory is it loaded?
Does it already include the code to decompress the zImage, or is that code loaded separately at another memory location?
Continuing to read on the same page, I found the following:
... start_kernel executes a wide range of initialization functions. It sets up interrupt handling (IRQs), further configures memory, starts the Init process (the first user-space process), ...
I know that init is the first user-space process created. The answer to the following question:
How the init process is started in linux kernel?
states that the kernel uses a do_execve() call. However, the semantics of the normal execv system call are to overwrite the calling process's (the kernel's, in this case?) bss, data, text and stack segments with the ones from the new program, and it does not return.
Why does it return in this case? (Otherwise, if it didn't return, the kernel wouldn't be able to continue its startup process.)
Thanks in advance,

Generating core dumps

From time to time my Go program crashes.
I tried a few things in order to get core dumps generated for this program:
setting the ulimit on the system: I tried both ulimit -c unlimited and ulimit -c 10000 just in case. After launching my panicking program, I get no core dump.
I also added recover() support in my program and added code to log to syslog in case of panic but I get nothing in syslog.
I am running out of ideas right now.
I must have overlooked something but I do not find what, any help would be appreciated.
Thanks ! :)
Note that a core dump is generated by the OS when a condition from a certain set is met. These conditions are pretty low-level — like trying to access unmapped memory or trying to execute an opcode the CPU does not know etc. Under a POSIX operating system such as Linux when a process does one of these things, an appropriate signal is sent to it, and some of them, if not handled by the process, have a default action of generating a core dump, which is done by the OS if not prohibited by setting a certain limit.
Now observe that this machinery treats a process at the lowest possible level (machine code), but the binaries a Go compiler produces are rather higher-level than those a C compiler (or assembler) produces, and this means certain errors in a process produced by a Go compiler are handled by the Go runtime rather than the OS. For instance, a typical NULL pointer dereference in a process produced by a C compiler usually results in the process being sent the SIGSEGV signal, which then typically results in an attempt to dump the process's core and terminate it. In contrast, when this happens in a process compiled by a Go compiler, the Go runtime kicks in and panics, producing a nice stack trace for debugging purposes.
With these facts in mind, I would try to do this:
Wrap your program in a shell script which first relaxes the limit for core dumps (but see below) and then runs your program with its standard error stream redirected to a file (or piped to the logger binary etc).
The limits a user can tweak have a hierarchy: there are soft and hard limits; see this and this for an explanation. So check that your system does not have 0 set as the hard limit for the core dump size, as that would explain why your attempt to raise the limit has no effect (see the sketch after this list).
At least on my Debian systems, when a program dies due to SIGSEGV, this fact is logged by the kernel and is visible in the syslog log files, so try grepping them for hints.
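To make the soft/hard distinction above concrete, here is a minimal C sketch (not Go-specific, and not part of the original answer) showing why a zero hard limit silently defeats ulimit -c unlimited:

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl;
        if (getrlimit(RLIMIT_CORE, &rl) != 0) {
            perror("getrlimit");
            return 1;
        }
        if (rl.rlim_max == 0) {
            /* An unprivileged process can never raise its soft limit above
               the hard limit, so no core file can be produced from here. */
            puts("hard RLIMIT_CORE is 0: core dumps are effectively disabled");
        } else {
            printf("soft=%llu hard=%llu (soft may be raised up to hard)\n",
                   (unsigned long long)rl.rlim_cur,
                   (unsigned long long)rl.rlim_max);
        }
        return 0;
    }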
First, please make sure all errors are handled.
For core dumps, you can refer to generate a core dump in linux.
You can use supervisor to restart the program when it crashes.

How do I ensure my Linux program doesn't produce core dumps?

I've got a program that keeps security-sensitive information (such as private keys) in memory, since it uses them over the lifetime of the program. Production versions of this program set RLIMIT_CORE to 0 to try to ensure that a core dump that might contain this sensitive information is never produced.
However, while this isn't mentioned in the core(5) manpage, the apport documentation on the Ubuntu wiki claims,
Note that even if ulimit is set to disable core files (by specifying a core file size of zero using ulimit -c 0), apport will still capture the crash.
Is there a way within my process (i.e., without relying on configuration of the system external to it) that I can ensure that a core dump of my process is never generated?
Note: I'm aware that there are plenty of methods (such as those mentioned in the comments below) where a user with root or process owner privileges could still access the sensitive data. What I'm aiming at here is preventing unintentional exposure of the sensitive data through it being saved to disk, being sent to the Ubuntu bug tracking system, or things like that. (Thanks to Basile Starynkevitch for making this explicit.)
According to the POSIX spec, core dumps only happen in response to signals whose action is the default action and whose default action is to "terminate the process abnormally with additional actions".
So, if you scroll down to the list in the description of signal.h, everything with an "A" in the "Default Action" column is a signal you need to worry about. Use sigaction to catch all of them and just call exit (or _exit) in the signal handler.
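A minimal sketch of that suggestion (the signal list below follows the POSIX table the answer refers to, but check signal.h / signal(7) on your own platform; the helper name block_core_dumps is just illustrative):

    #include <signal.h>
    #include <stddef.h>
    #include <string.h>
    #include <unistd.h>

    static void no_core_handler(int sig)
    {
        (void)sig;
        _exit(1);   /* plain exit: no "abnormal termination", hence no core */
    }

    static void block_core_dumps(void)
    {
        /* Signals whose default action is to terminate abnormally with a
           core dump ("A" in the POSIX "Default Action" column). */
        static const int core_signals[] = {
            SIGABRT, SIGBUS, SIGFPE, SIGILL, SIGQUIT,
            SIGSEGV, SIGSYS, SIGTRAP, SIGXCPU, SIGXFSZ
        };
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = no_core_handler;
        sigemptyset(&sa.sa_mask);

        for (size_t i = 0; i < sizeof core_signals / sizeof core_signals[0]; i++)
            sigaction(core_signals[i], &sa, NULL);
    }

Combined with the RLIMIT_CORE = 0 you already set, this covers the POSIX-visible paths; as noted below, Linux-specific mechanisms might still need attention.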
I believe these are the only ways POSIX lets you generate a core dump. Conceivably, Linux might have other "back doors" for this purpose; unfortunately, I am not enough of a kernel expert to be sure...

Minimal core dump (stack trace + current frame only)

Can I configure what goes into a core dump on Linux? I want to obtain something like the Windows mini-dumps (minimal information about the stack frame when the app crashed). I know you can set a max size for the core files using ulimit, but this does not allow me to control what goes inside the core (i.e. there is no guarantee that if I set the limit to 64kb it will dump the last 16 pages of the stack, for example).
Also, I would like to set it in a programmatic way (from code), if possible.
I have looked at the /proc/PID/coredump_filter file mentioned by man core, but it seems too coarse-grained for my purposes.
To provide a little context: I need tiny core files, for multiple reasons: I need to collect them over the network, for numerous (thousands) of clients; furthermore, these are embedded devices with little SD cards, and GPRS modems for the network connection. So anything above ~200k is out of question.
EDIT: I am working on an embedded device which runs linux 2.6.24. The processor is PowerPC. Unfortunately, powerpc-linux is not supported in breakpad at the moment, so google breakpad is not an option
I have "solved" this issue in two ways:
I installed a signal handler for SIGSEGV, and used backtrace/backtrace_symbols to print out the stack trace (see the sketch after this list). I compiled my code with -rdynamic, so even after stripping the debug info I still get a backtrace with meaningful names (while keeping the executable compact enough).
I stripped the debug info and put it in a separate file, which I will store somewhere safe, using strip; from there, I will use addr2line with the addresses saved from the backtrace to understand where the problem happened. This way I have to store only a few bytes.
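A minimal sketch of approach 1 (assumptions: glibc/Linux, compiled with gcc -g -rdynamic; backtrace_symbols_fd is used instead of backtrace_symbols because it avoids calling malloc inside the signal handler):

    #include <execinfo.h>
    #include <signal.h>
    #include <unistd.h>

    static void segv_handler(int sig)
    {
        void *frames[64];
        int n = backtrace(frames, 64);
        (void)sig;
        /* Write the symbol names / addresses straight to stderr; the hex
           addresses can later be fed to addr2line with the saved debug info. */
        backtrace_symbols_fd(frames, n, STDERR_FILENO);
        _exit(1);   /* do not return into the faulting instruction */
    }

    int main(void)
    {
        signal(SIGSEGV, segv_handler);

        volatile int *p = NULL;
        *p = 42;    /* deliberate crash to exercise the handler */
        return 0;
    }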
Alternatively, I found I could use the /proc/self/coredump_filter to dump no memory (setting its content to "0"): only thread and proc info, registers, stacktrace etc. are saved in the core. See more in this answer
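A minimal sketch of that alternative, set from code as asked (assuming Linux with /proc mounted; the helper name is illustrative):

    #include <stdio.h>

    /* Ask the kernel to include no memory-mapping types in this process's
       core dumps; thread/process metadata and registers are still written,
       which keeps the core file very small. */
    static int shrink_core_dumps(void)
    {
        FILE *f = fopen("/proc/self/coredump_filter", "w");
        if (!f)
            return -1;          /* file may be absent on older kernels */
        fputs("0", f);          /* bitmask 0: dump no mapping types */
        return fclose(f);
    }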
I still lose information that could be precious (global and local variable(s) content, params..). I could easily figure out which page(s) to dump, but unfortunately there is no way to specify a "dump-these-pages" for normal core dumps (unless you are willing to go and patch the maydump() function in the kernel).
For now, I'm quite happy with these 2 solutions (it is better than nothing..) My next moves will be:
see how difficult it would be to port Breakpad to powerpc-linux: there are already powerpc-darwin and i386-linux, so... how hard can it be? :)
try to use google-coredumper to dump only a few pages around the current ESP (that should give me locals and parameters) and around "&some_global" (that should give me globals).
