Does GNU time memory output account for child processes too? - linux

When running GNU time (/usr/bin/time) and checking for memory consumption, does its output account for the memory usage of the child processes of my target program?
Could not find anything in GNU's time manpage.

Yes.
You can easily check with:
$ /usr/bin/time -f '%M' sh -c 'perl -e "\$y=q{x}x(2*1024*1024)" & wait'
8132
$ /usr/bin/time -f '%M' sh -c 'perl -e "\$y=q{x}x(8*1024*1024)" & wait'
20648
GNU time uses the wait4 system call on Linux (via the wait3 glibc wrapper), and although this is undocumented, the resource usage it returns in the struct rusage also includes the descendants of the process waited for. You can look at the kernel implementation of wait4 in kernel/exit.c for all the details:
$ grep -C2 RUSAGE_BOTH include/uapi/linux/resource.h
#define RUSAGE_SELF 0
#define RUSAGE_CHILDREN (-1)
#define RUSAGE_BOTH (-2) /* sys_wait4() uses this */
#define RUSAGE_THREAD 1 /* only the calling thread */
FreeBSD and NetBSD also have a wait6 system call which returns separate info for the process waited for and for its descendants. They also clearly document that the rusage returned by wait3 and wait4 also includes grandchildren.
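The same experiment can be sketched in Python, assuming Linux (where ru_maxrss is reported in kilobytes) and Python 3. os.wait4 is the standard-library wrapper around wait4; the helper names peak_rss_kb and grandchild_alloc_cmd are made up for this illustration. The direct child does nothing but start a grandchild that allocates memory, so any growth in the reported peak has to come from the descendant:

```python
import os
import sys


def peak_rss_kb(argv):
    """Run argv as a child and return ru_maxrss from wait4() (KiB on Linux)."""
    pid = os.fork()
    if pid == 0:
        # Child: replace ourselves with the target program.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    _pid, _status, rusage = os.wait4(pid, 0)
    return rusage.ru_maxrss


def grandchild_alloc_cmd(n_bytes):
    """Hypothetical helper: a child that only starts a grandchild allocating n_bytes."""
    inner = "x = b'x' * %d" % n_bytes
    outer = ("import subprocess, sys; "
             "subprocess.run([sys.executable, '-c', %r])" % inner)
    return [sys.executable, "-c", outer]


if __name__ == "__main__":
    small = peak_rss_kb(grandchild_alloc_cmd(1))
    big = peak_rss_kb(grandchild_alloc_cmd(128 * 1024 * 1024))
    print("small grandchild:", small, "KiB")
    print("big grandchild:  ", big, "KiB")
```

If the second number is larger by roughly the size of the allocation, the grandchild's usage was folded into what wait4 returned for the child.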

Related

Why can't strace -f trace the child process after |?

I am trying to see which system calls are made when I run a command, but the commands after | don't seem to be shown. For example:
strace -f cat a.txt| cat
I thought strace with the -f parameter would show the whole pipeline, since I assume the last part runs in a child process created by fork. Why doesn't it, and how can I make it work?
From the strace manual (emphasis mine).
-f Trace child processes as they are created by
currently traced processes as a result of the fork(2),
vfork(2) and clone(2) system calls.
The traced process in your case is the first cat process. The second cat process is not a child of the first cat process. The fork is done by the shell.
One way to achieve what you want is to trace the shell:
strace -f bash -c "cat a.txt| cat"
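To make the parent/child relationship concrete, here is a rough Python sketch of what a shell does for a two-command pipeline: it creates the pipe and forks both sides itself, so the two commands are siblings and neither is a child of the other. The function name run_pipeline is made up for this example.

```python
import os


def run_pipeline(left_argv, right_argv):
    """Fork both sides of `left | right` from this process, as a shell would.

    Returns (left_pid, right_pid); both are direct children of the caller.
    """
    read_end, write_end = os.pipe()

    left_pid = os.fork()
    if left_pid == 0:
        os.dup2(write_end, 1)  # stdout -> pipe
        os.close(read_end)
        os.close(write_end)
        try:
            os.execvp(left_argv[0], left_argv)
        finally:
            os._exit(127)  # only reached if exec failed

    right_pid = os.fork()
    if right_pid == 0:
        os.dup2(read_end, 0)  # stdin <- pipe
        os.close(read_end)
        os.close(write_end)
        try:
            os.execvp(right_argv[0], right_argv)
        finally:
            os._exit(127)

    # The parent keeps no pipe ends open, so the reader sees EOF.
    os.close(read_end)
    os.close(write_end)
    return left_pid, right_pid


if __name__ == "__main__":
    left, right = run_pipeline(["cat", "a.txt"], ["cat"])
    for pid in (left, right):
        os.waitpid(pid, 0)  # only succeeds for our own children
```

Tracing only the left-hand process therefore never sees the right-hand one; tracing the process that does the forking sees both.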

How to monitor process status during process lifetime

I need to track the process status (as shown by ps axf) during the lifetime of an executable.
Let's say I have an executable main.exec and want to store in a file all the subprocesses that are started while main.exec runs.
$ main.exec &
$ echo $! # and redirect every ps change for PID $! in a file.
strace - trace system calls and signals
$ main.exec &
$ strace -f -p $! -o child.txt
-f Trace child processes as they are created by currently traced processes as a result of the fork(2), vfork(2) and clone(2) system calls. Note that -p PID -f will attach all threads of process PID if it is multi-threaded, not only thread with thread_id = PID.
If you can't recompile and instrument main.exec, ps in a loop is a simple option that may work for you:
while true; do ps --ppid=<pid> --pid=<pid> -o pid,ppid,%cpu,... >> mytrace.txt; sleep 0.2; done
Then parse the output accordingly.
top may also work and can run in batch mode, but I'm not sure you can get it to dynamically monitor child processes the way ps does. I don't think so.
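If you'd rather not shell out to ps, roughly the same polling can be done by scanning /proc directly. This is only a sketch, assuming Linux with /proc mounted; children_of and the log file name are made up for the example, and like the ps loop it can miss children that start and exit between two polls.

```python
import os
import sys
import time


def children_of(parent_pid):
    """Return the PIDs whose parent is parent_pid, by scanning /proc/<pid>/stat."""
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/stat" % entry) as f:
                stat = f.read()
        except OSError:
            continue  # process exited between listdir() and open()
        # The command name (field 2) is in parentheses and may contain spaces,
        # so split after the last ')': what follows is state, ppid, ...
        fields = stat.rsplit(")", 1)[1].split()
        if int(fields[1]) == parent_pid:
            found.append(int(entry))
    return sorted(found)


if __name__ == "__main__":
    target = int(sys.argv[1])
    with open("mytrace.txt", "a") as log:
        while os.path.exists("/proc/%d" % target):
            log.write("%f %s\n" % (time.time(), children_of(target)))
            log.flush()
            time.sleep(0.2)
```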

Reading living process memory without interrupting it

I would like to explore the memory of a living process, and when I do so, the process must not get disturbed - so attaching gdb to the process (which would stop it) is not an option.
Therefore I would like to get this info from /proc/kcore (if you know of another way to do this please let me know).
So I made a little experiment. I created a file called TEST with only "EXTRATESTEXTRA" inside.
Then I opened it with less
$ less TEST
I got the PID of this process with
$ ps aux | grep TEST
user 7785 0.0 0.0 17944 992 pts/8 S+ 16:15 0:00 less TEST
user 7798 0.0 0.0 13584 904 pts/9 S+ 16:16 0:00 grep TEST
And then I used this script to create a dump of all files :
#!/bin/bash
grep rw-p /proc/$1/maps | sed -n 's/^\([0-9a-f]*\)-\([0-9a-f]*\) .*$/\1 \2/p' | while read start stop; do gdb --batch --pid $1 -ex "dump memory $1-$start-$stop.dump 0x$start 0x$stop"; done
(I found it on this site https://serverfault.com/questions/173999/dump-a-linux-processs-memory-to-file)
$ sudo ./dump_all_pid_memory.sh 7785
After this, I looked for "TRATESTEX" in all dumped files :
$ grep -a -o -e '...TRATESTEX...' ./*.dump
./7785-00624000-00628000.dump:HEXTRATESTEXTRA
./7785-00b8f000-00bb0000.dump:EXTRATESTEXTRA
./7785-00b8f000-00bb0000.dump:EXTRATESTEXTRA
So I concluded that there must be an occurrence of this string somewhere between 0x00624000 and 0x00628000.
Therefore I converted the offsets into decimal numbers and used dd to get the memory from /proc/kcore :
$ sudo dd if="/proc/kcore" of="./y.txt" skip="0" count="1638400" bs=1
To my surprise, the file y.txt was full of zeros (I didn't find the string I was looking for in it).
As a bonus surprise, I ran a similar test at the same time with a different test file and found that the other test string I was using (both less processes were running at the same time) appeared at the same location (dumping and grepping gave the same offset).
So there must be something I don't understand clearly.
Isn't /proc/pid/maps supposed to show the offset of the memory? (I.e., if it says "XXX" is at offset 0x10, another program could not be using the same offset, am I right? This is the source of my second surprise.)
How can I read /proc/kcore to get the memory that belongs to a process whose PID I know?
If you have root access and are on a Linux system, you can use the following Python script (adapted from Gilles' excellent unix.stackexchange.com answer and from the answer originally given in the question above, which contained syntax errors and was not very Pythonic):
#!/usr/bin/env python3
import re
import sys
def print_memory_of_pid(pid, only_writable=True):
    """
    Run as root: take an integer PID and write the contents of its
    memory regions to stdout (by default, only the writable ones).
    """
    memory_permissions = 'rw' if only_writable else 'r-'
    sys.stderr.write("PID = %d" % pid)
    with open("/proc/%d/maps" % pid, 'r') as maps_file:
        # /proc/<pid>/mem is binary; open it unbuffered
        with open("/proc/%d/mem" % pid, 'rb', 0) as mem_file:
            for line in maps_file:  # for each mapped region
                m = re.match(r'([0-9A-Fa-f]+)-([0-9A-Fa-f]+) ([-r][-w])', line)
                if m and m.group(3) == memory_permissions:
                    sys.stderr.write("\nOK : \n" + line + "\n")
                    start = int(m.group(1), 16)
                    if start > 0xFFFFFFFFFFFF:
                        continue
                    end = int(m.group(2), 16)
                    sys.stderr.write("start = " + str(start) + "\n")
                    mem_file.seek(start)  # seek to region start
                    chunk = mem_file.read(end - start)  # read region contents
                    sys.stdout.buffer.write(chunk)  # dump contents to standard output
                else:
                    sys.stderr.write("\nPASS : \n" + line + "\n")
if __name__ == '__main__':  # Execute this code when run from the command line.
    try:
        assert len(sys.argv) == 2, "Provide exactly 1 PID (process ID)"
        pid = int(sys.argv[1])
        print_memory_of_pid(pid)
    except (AssertionError, ValueError):
        print("Please provide 1 PID as a command-line argument.")
        print("You entered: %s" % ' '.join(sys.argv))
        raise
If you save this as write_mem.py, you can run it (with Python 3) as:
sudo python3 write_mem.py 1234 > pid1234_memory_dump
to dump the memory of PID 1234 to the file pid1234_memory_dump.
For process 1234 you can get its memory map by reading sequentially /proc/1234/maps (a textual pseudo-file) and read the virtual memory by e.g. read(2)-ing or mmap(2)-ing appropriate segments of the /proc/1234/mem sparse pseudo-file.
However, I believe you cannot avoid some kind of synchronization (perhaps with ptrace(2), as gdb does), since the process 1234 can (and does) alter its address space at any time (with mmap & related syscalls).
The situation is different if the monitored process 1234 is not arbitrary, but if you could improve it to communicate somehow with the monitoring process.
I'm not sure I understand why you are asking this. Also, gdb is able to watch some locations without stopping the process.
Since version 3.2 of the kernel, you can use the process_vm_readv system call to read process memory without interruption.
ssize_t process_vm_readv(pid_t pid,
const struct iovec *local_iov,
unsigned long liovcnt,
const struct iovec *remote_iov,
unsigned long riovcnt,
unsigned long flags);
These system calls transfer data between the address space of the
calling process ("the local process") and the process identified by
pid ("the remote process"). The data moves directly between the
address spaces of the two processes, without passing through kernel
space.
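For what it's worth, here is a minimal sketch of calling process_vm_readv from Python through ctypes, assuming Linux 3.2 or later and a libc that exports the wrapper. The names IOVec and read_remote are made up for this example. Reading another process is subject to the same permission check as attaching with ptrace, so for a process you don't own you will typically need root.

```python
import ctypes
import os


class IOVec(ctypes.Structure):
    """Mirror of struct iovec from <sys/uio.h>."""
    _fields_ = [("iov_base", ctypes.c_void_p),
                ("iov_len", ctypes.c_size_t)]


_libc = ctypes.CDLL(None, use_errno=True)
_libc.process_vm_readv.restype = ctypes.c_ssize_t
_libc.process_vm_readv.argtypes = [
    ctypes.c_int,                      # pid
    ctypes.POINTER(IOVec), ctypes.c_ulong,  # local_iov, liovcnt
    ctypes.POINTER(IOVec), ctypes.c_ulong,  # remote_iov, riovcnt
    ctypes.c_ulong,                    # flags (must be 0)
]


def read_remote(pid, address, length):
    """Copy `length` bytes starting at `address` in process `pid`."""
    buf = ctypes.create_string_buffer(length)
    local = IOVec(ctypes.addressof(buf), length)
    remote = IOVec(address, length)
    n = _libc.process_vm_readv(pid, ctypes.byref(local), 1,
                               ctypes.byref(remote), 1, 0)
    if n < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return buf.raw[:n]


if __name__ == "__main__":
    # Demonstration against our own address space, so no privileges are needed.
    source = ctypes.create_string_buffer(b"EXTRATESTEXTRA")
    print(read_remote(os.getpid(), ctypes.addressof(source), 14))
```

For a real target you would take the address range from /proc/PID/maps, exactly as with the /proc/PID/mem approach.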
You'll have to use /proc/pid/mem to read process memory. I wouldn't recommend trying to read /proc/kcore or any of the kernel memory functions (which is time consuming).
If you just want to get the value of a global variable or a specified address, you can use my tool gvardump instead of reading the entire memory.
gvardump will parse the variable address and print its value nicely without causing process interruption.
For example:
root@ubuntu:/home/u/trace_test# ./gvardump.py 53670 -a 1 '*g_ss[0].sss[0].ps'
*((g_ss[0]).sss[0]).ps = {
.a = 6,
.sss = {
{
.bbb = 0,
.ps = 0x563ca42a2020,
.bs = {
.m = 0,
},
},
// other 9 elements are omit
},
...
and when I do so, the process must not get disturbed - so attaching gdb to the process (which would stop it) is not an option.
I have modified gdb to avoid attaching.
With this modified gdb, you can run gdb -m <PID> to explore the memory without stopping the process.
I achieved this by issuing the command below:
[root@stage1 ~]# echo "Memory usage for PID [MySql]:"; for mem in {Private,Rss,Shared,Swap,Pss}; do grep $mem /proc/$(ps aux | grep mysql | awk '{print $2}' | head -n 1)/smaps | awk -v mem_type="$mem" '{i=i+$2} END {print mem_type,"memory usage:"i}'; done
Result Output
Memory usage for PID [MySql]:
Private memory usage:204
Rss memory usage:1264
Shared memory usage:1060
Swap memory usage:0
Pss memory usage:423
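Roughly the same summation can be sketched in Python, assuming Linux with /proc/PID/smaps readable for the target. smaps_totals is a made-up name; like the grep above, it matches each category as a substring of the field name (so "Private" covers Private_Clean and Private_Dirty), and the totals are in kB.

```python
def smaps_totals(pid, categories=("Private", "Rss", "Shared", "Swap", "Pss")):
    """Sum the kB values in /proc/<pid>/smaps per category (substring match)."""
    totals = dict.fromkeys(categories, 0)
    with open("/proc/%s/smaps" % pid) as f:
        for line in f:
            parts = line.split()
            # Value lines look like "Rss:      1264 kB"; skip everything else.
            if len(parts) == 3 and parts[2] == "kB":
                for category in categories:
                    if category in parts[0]:
                        totals[category] += int(parts[1])
    return totals


if __name__ == "__main__":
    for name, kb in smaps_totals("self").items():
        print("%s memory usage: %d kB" % (name, kb))
```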

Is there a way to figure out what is using a Linux kernel module?

If I load a kernel module and list the loaded modules with lsmod, I can get the "use count" of the module (number of other modules with a reference to the module). Is there a way to figure out what is using a module, though?
The issue is that a module I am developing insists its use count is 1 and thus I cannot use rmmod to unload it, but its "by" column is empty. This means that every time I want to re-compile and re-load the module, I have to reboot the machine (or, at least, I can't figure out any other way to unload it).
Actually, there seems to be a way to list processes that claim a module/driver - however, I haven't seen it advertised (outside of Linux kernel documentation), so I'll jot down my notes here:
First of all, many thanks for @haggai_e's answer; the pointer to the functions try_module_get and module_put as those responsible for managing the use count (refcount) was the key that allowed me to track down the procedure.
Looking further for this online, I somehow stumbled upon the post Linux-Kernel Archive: [PATCH 1/2] tracing: Reduce overhead of module tracepoints; which finally pointed to a facility present in the kernel, known as (I guess) "tracing"; the documentation for this is in the directory Documentation/trace - Linux kernel source tree. In particular, two files explain the tracing facility, events.txt and ftrace.txt.
But, there is also a short "tracing mini-HOWTO" on a running Linux system in /sys/kernel/debug/tracing/README (see also I'm really really tired of people saying that there's no documentation…); note that in the kernel source tree, this file is actually generated by the file kernel/trace/trace.c. I've tested this on Ubuntu natty, and note that since /sys is owned by root, you have to use sudo to read this file, as in sudo cat or
sudo less /sys/kernel/debug/tracing/README
... and that goes for pretty much all other operations under /sys which will be described here.
First of all, here is a simple minimal module/driver code (which I put together from the referred resources), which simply creates a /proc/testmod-sample file node, which returns the string "This is testmod." when it is being read; this is testmod.c:
/*
https://github.com/spotify/linux/blob/master/samples/tracepoints/tracepoint-sample.c
https://www.linux.com/learn/linux-training/37985-the-kernel-newbie-corner-kernel-debugging-using-proc-qsequenceq-files-part-1
*/
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h> // for sequence files
struct proc_dir_entry *pentry_sample;
char *defaultOutput = "This is testmod.";
static int my_show(struct seq_file *m, void *v)
{
    seq_printf(m, "%s\n", defaultOutput);
    return 0;
}
static int my_open(struct inode *inode, struct file *file)
{
    return single_open(file, my_show, NULL);
}
static const struct file_operations mark_ops = {
    .owner = THIS_MODULE,
    .open = my_open,
    .read = seq_read,
    .llseek = seq_lseek,
    .release = single_release,
};
static int __init sample_init(void)
{
    printk(KERN_ALERT "sample init\n");
    pentry_sample = proc_create(
        "testmod-sample", 0444, NULL, &mark_ops);
    if (!pentry_sample)
        return -EPERM;
    return 0;
}
static void __exit sample_exit(void)
{
    printk(KERN_ALERT "sample exit\n");
    remove_proc_entry("testmod-sample", NULL);
}
module_init(sample_init);
module_exit(sample_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Mathieu Desnoyers et al.");
MODULE_DESCRIPTION("based on Tracepoint sample");
This module can be built with the following Makefile (just have it placed in the same directory as testmod.c, and then run make in that same directory):
CONFIG_MODULE_FORCE_UNLOAD=y
# for oprofile
DEBUG_INFO=y
EXTRA_CFLAGS=-g -O0
obj-m += testmod.o
# mind the tab characters needed at start here:
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
When this module/driver is built, the output is a kernel object file, testmod.ko.
At this point, we can prepare the event tracing related to try_module_get and module_put; those are in /sys/kernel/debug/tracing/events/module:
$ sudo ls /sys/kernel/debug/tracing/events/module
enable filter module_free module_get module_load module_put module_request
Note that on my system, tracing is by default enabled:
$ sudo cat /sys/kernel/debug/tracing/tracing_enabled
1
... however, the module tracing (specifically) is not:
$ sudo cat /sys/kernel/debug/tracing/events/module/enable
0
Now, we should first make a filter, that will react on the module_get, module_put etc events, but only for the testmod module. To do that, we should first check the format of the event:
$ sudo cat /sys/kernel/debug/tracing/events/module/module_put/format
name: module_put
ID: 312
format:
...
field:__data_loc char[] name; offset:20; size:4; signed:1;
print fmt: "%s call_site=%pf refcnt=%d", __get_str(name), (void *)REC->ip, REC->refcnt
Here we can see that there is a field called name, which holds the driver name, which we can filter against. To create a filter, we simply echo the filter string into the corresponding file:
sudo bash -c "echo name == testmod > /sys/kernel/debug/tracing/events/module/filter"
Here, first note that since we have to call sudo, we have to wrap the whole echo redirection as an argument command of a sudo-ed bash. Second, note that since we wrote to the "parent" module/filter, not the specific events (which would be module/module_put/filter etc), this filter will be applied to all events listed as "children" of module directory.
Finally, we enable tracing for module:
sudo bash -c "echo 1 > /sys/kernel/debug/tracing/events/module/enable"
From this point on, we can read the trace log file; for me, reading the blocking,
"piped" version of the trace file worked - like this:
sudo cat /sys/kernel/debug/tracing/trace_pipe | tee tracelog.txt
At this point, we will not see anything in the log - so it is time to load (and utilize, and remove) the driver (in a different terminal from where trace_pipe is being read):
$ sudo insmod ./testmod.ko
$ cat /proc/testmod-sample
This is testmod.
$ sudo rmmod testmod
If we go back to the terminal where trace_pipe is being read, we should see something like:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
insmod-21137 [001] 28038.101509: module_load: testmod
insmod-21137 [001] 28038.103904: module_put: testmod call_site=sys_init_module refcnt=2
rmmod-21354 [000] 28080.244448: module_free: testmod
That is pretty much all we will obtain for our testmod driver - the refcount changes only when the driver is loaded (insmod) or unloaded (rmmod), not when we do a read through cat. So we can simply interrupt the read from trace_pipe with CTRL+C in that terminal; and to stop the tracing altogether:
sudo bash -c "echo 0 > /sys/kernel/debug/tracing/tracing_enabled"
Here, note that most examples refer to reading the file /sys/kernel/debug/tracing/trace instead of trace_pipe as here. However, one problem is that this file is not meant to be "piped" (so you shouldn't run a tail -f on this trace file); but instead you should re-read the trace after each operation. After the first insmod, we would obtain the same output from cat-ing both trace and trace_pipe; however, after the rmmod, reading the trace file would give:
<...>-21137 [001] 28038.101509: module_load: testmod
<...>-21137 [001] 28038.103904: module_put: testmod call_site=sys_init_module refcnt=2
rmmod-21354 [000] 28080.244448: module_free: testmod
... that is: by this point insmod has long since exited, so it no longer exists in the process list and cannot be looked up via the PID recorded at the time; thus we get a blank <...> as the process name. Therefore, it is better to log (via tee) a running output from trace_pipe in this case. Also, note that in order to clear/reset/erase the trace file, one simply writes a 0 to it:
sudo bash -c "echo 0 > /sys/kernel/debug/tracing/trace"
If this seems counterintuitive, note that trace is a special file, and will always report a file size of zero anyways:
$ sudo ls -la /sys/kernel/debug/tracing/trace
-rw-r--r-- 1 root root 0 2013-03-19 06:39 /sys/kernel/debug/tracing/trace
... even if it is "full".
Finally, note that if we didn't implement a filter, we would have obtained a log of all module calls on the running system - which would log any call (also background) to grep and such, as those use the binfmt_misc module:
...
tr-6232 [001] 25149.815373: module_put: binfmt_misc call_site=search_binary_handler refcnt=133194
..
grep-6231 [001] 25149.816923: module_put: binfmt_misc call_site=search_binary_handler refcnt=133196
..
cut-6233 [000] 25149.817842: module_put: binfmt_misc call_site=search_binary_handler refcnt=129669
..
sudo-6234 [001] 25150.289519: module_put: binfmt_misc call_site=search_binary_handler refcnt=133198
..
tail-6235 [000] 25150.316002: module_put: binfmt_misc call_site=search_binary_handler refcnt=129671
... which adds quite a bit of overhead (in both the amount of log data and the processing time required to generate it).
While looking this up, I stumbled upon the Debugging Linux Kernel by Ftrace PDF, which refers to a tool, trace-cmd, that does much the same as the above, but through an easier command-line interface. There is also a "front-end reader" GUI for trace-cmd called KernelShark; both of these are also in the Debian/Ubuntu repositories via sudo apt-get install trace-cmd kernelshark. These tools could be an alternative to the procedure described above.
Finally, I'd just note that, while the above testmod example doesn't really show use in the context of multiple claims, I have used the same tracing procedure to discover that a USB module I'm coding was repeatedly claimed by pulseaudio as soon as the USB device was plugged in, so the procedure seems to work for such use cases.
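Once the output of trace_pipe has been saved with tee, the lines can be picked apart with a short script. This is only a sketch that assumes the line layout shown in the examples above (task-PID, CPU in brackets, timestamp, event name, module name, then key=value pairs); newer kernels may print extra columns, in which case the pattern would need adjusting. The name parse_module_event is made up for the example.

```python
import re

# Matches lines such as:
#   insmod-21137 [001] 28038.103904: module_put: testmod call_site=... refcnt=2
_EVENT_RE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<timestamp>[\d.]+):\s+(?P<event>\w+):\s+(?P<module>\S+)(?P<rest>.*)$"
)


def parse_module_event(line):
    """Return a dict describing one module event line, or None for other lines."""
    m = _EVENT_RE.match(line)
    if m is None:
        return None  # header/comment line or an unexpected layout
    event = {
        "task": m.group("task"),
        "pid": int(m.group("pid")),
        "cpu": int(m.group("cpu")),
        "timestamp": float(m.group("timestamp")),
        "event": m.group("event"),
        "module": m.group("module"),
    }
    # Trailing key=value pairs such as call_site=... and refcnt=...
    for pair in m.group("rest").split():
        key, sep, value = pair.partition("=")
        if sep:
            event[key] = int(value) if value.isdigit() else value
    return event


if __name__ == "__main__":
    with open("tracelog.txt") as log:
        for raw in log:
            parsed = parse_module_event(raw)
            if parsed and parsed["event"] in ("module_get", "module_put"):
                print(parsed["task"], parsed["pid"], parsed["event"],
                      parsed.get("refcnt"))
```

Grouping the parsed events by task name is then a quick way to see which process keeps taking references on the module.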
It says on the Linux Kernel Module Programming Guide that the use count of a module is controlled by the functions try_module_get and module_put. Perhaps you can find where these functions are called for your module.
More info: https://www.kernel.org/doc/htmldocs/kernel-hacking/routines-module-use-counters.html
All you get is a list of which modules depend on which other modules (the Used by column in lsmod). You can't write a program to tell why the module was loaded, if it is still needed for anything, or what might break if you unload it and everything that depends on it.
You might try lsof or fuser.
If you use rmmod WITHOUT the --force option, it will tell you what is using a module. Example:
$ lsmod | grep firewire
firewire_ohci 24695 0
firewire_core 50151 1 firewire_ohci
crc_itu_t 1717 1 firewire_core
$ sudo modprobe -r firewire-core
FATAL: Module firewire_core is in use.
$ sudo rmmod firewire_core
ERROR: Module firewire_core is in use by firewire_ohci
$ sudo modprobe -r firewire-ohci
$ sudo modprobe -r firewire-core
$ lsmod | grep firewire
$
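The information lsmod prints comes from /proc/modules, so the "use count is 1 but nothing is listed as using it" situation from the question can be detected with a small script. This is a sketch assuming the usual /proc/modules column order (name, size, reference count, comma-separated dependents or "-", state, address); parse_proc_modules and unexplained_references are made-up names, and the testmod line in the sample is hypothetical.

```python
def parse_proc_modules(text):
    """Parse /proc/modules content into {name: {"refcount": int, "used_by": [...]}}."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[2].isdigit():
            continue  # blank line, or refcount shown as "-"
        name, _size, refcount, used_by = fields[:4]
        users = [] if used_by == "-" else [u for u in used_by.split(",") if u]
        modules[name] = {"refcount": int(refcount), "used_by": users}
    return modules


def unexplained_references(modules):
    """Modules whose refcount exceeds the number of modules listed as using them."""
    return {name: info for name, info in modules.items()
            if info["refcount"] > len(info["used_by"])}


if __name__ == "__main__":
    with open("/proc/modules") as f:
        for name, info in unexplained_references(parse_proc_modules(f.read())).items():
            print("%s: refcount %d, used by %s"
                  % (name, info["refcount"], info["used_by"] or "nothing listed"))
```

Modules reported here are held by something other than another module (an open file, a process, a leaked reference in the module itself), which is where the tracing approach or lsof and fuser come in.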
Try kgdb and set a breakpoint in your module.
For anyone desperate to figure out why they can't reload modules, I was able to work around this problem by
Getting the path of the currently used module using "modinfo"
rm -rfing it
Copying the new module I wanted to load to the path it was in
Typing "modprobe DRIVER_NAME.ko".

How to generate a core dump in Linux on a segmentation fault?

I have a process in Linux that's getting a segmentation fault. How can I tell it to generate a core dump when it fails?
This depends on what shell you are using. If you are using bash, then the ulimit command controls several settings relating to program execution, such as whether you should dump core. If you type
ulimit -c unlimited
then that will tell bash that its programs can dump cores of any size. You can specify a size such as 52M instead of unlimited if you want, but in practice this shouldn't be necessary since the size of core files will probably never be an issue for you.
In tcsh, you'd type
limit coredumpsize unlimited
As explained above the real question being asked here is how to enable core dumps on a system where they are not enabled. That question is answered here.
If you've come here hoping to learn how to generate a core dump for a hung process, the answer is
gcore <pid>
if gcore is not available on your system then
kill -ABRT <pid>
Don't use kill -SEGV, as that will often invoke a signal handler, making it harder to diagnose the stuck process.
To check where the core dumps are generated, run:
sysctl kernel.core_pattern
or:
cat /proc/sys/kernel/core_pattern
where %e is the process name and %t the system time. You can change it in /etc/sysctl.conf and reload it with sysctl -p.
If the core files are not generated (test it by: sleep 10 & and killall -SIGSEGV sleep), check the limits by: ulimit -a.
If your core file size is limited, run:
ulimit -c unlimited
to make it unlimited.
Then test again; if the core dump is successful, you will see “(core dumped)” after the segmentation fault indication, as below:
Segmentation fault: 11 (core dumped)
See also: core dumped - but core file is not in current directory?
Ubuntu
In Ubuntu the core dumps are handled by Apport and can be located in /var/crash/. However, it is disabled by default in stable releases.
For more details, please check: Where do I find the core dump in Ubuntu?.
macOS
For macOS, see: How to generate core dumps in Mac OS X?
What I did at the end was attach gdb to the process before it crashed, and then when it got the segfault I executed the generate-core-file command. That forced generation of a core dump.
Maybe you could do it this way. This program demonstrates how to trap a segmentation fault and shell out to a debugger (this is the original code used under AIX), printing the stack trace up to the point of the segmentation fault. You will need to change the sprintf format string to use gdb in the case of Linux.
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <stdarg.h>
#include <unistd.h> /* getpid() */
static void signal_handler(int);
static void dumpstack(void);
static void cleanup(void);
void init_signals(void);
void panic(const char *, ...);
struct sigaction sigact;
char *progname;
int main(int argc, char **argv) {
    char *s;
    progname = *(argv);
    atexit(cleanup);
    init_signals();
    printf("About to seg fault by assigning zero to *s\n");
    *s = 0;
    sigemptyset(&sigact.sa_mask);
    return 0;
}
void init_signals(void) {
    sigact.sa_handler = signal_handler;
    sigemptyset(&sigact.sa_mask);
    sigact.sa_flags = 0;
    sigaction(SIGINT, &sigact, (struct sigaction *)NULL);
    sigaddset(&sigact.sa_mask, SIGSEGV);
    sigaction(SIGSEGV, &sigact, (struct sigaction *)NULL);
    sigaddset(&sigact.sa_mask, SIGBUS);
    sigaction(SIGBUS, &sigact, (struct sigaction *)NULL);
    sigaddset(&sigact.sa_mask, SIGQUIT);
    sigaction(SIGQUIT, &sigact, (struct sigaction *)NULL);
    sigaddset(&sigact.sa_mask, SIGHUP);
    sigaction(SIGHUP, &sigact, (struct sigaction *)NULL);
    /* SIGKILL cannot be caught, so no handler is installed for it */
}
static void signal_handler(int sig) {
    if (sig == SIGHUP) panic("FATAL: Program hanged up\n");
    if (sig == SIGSEGV || sig == SIGBUS) {
        dumpstack();
        panic("FATAL: %s Fault. Logged StackTrace\n", (sig == SIGSEGV) ? "Segmentation" : ((sig == SIGBUS) ? "Bus" : "Unknown"));
    }
    if (sig == SIGQUIT) panic("QUIT signal ended program\n");
    if (sig == SIGINT) ;
}
void panic(const char *fmt, ...) {
    char buf[50];
    va_list argptr;
    va_start(argptr, fmt);
    vsnprintf(buf, sizeof buf, fmt, argptr); /* bounded, unlike vsprintf */
    va_end(argptr);
    fputs(buf, stderr); /* never pass buf itself as the format string */
    exit(-1);
}
static void dumpstack(void) {
    /* Got this routine from http://www.whitefang.com/unix/faq_toc.html
    ** Section 6.5. Modified to redirect to file to prevent clutter
    */
    /* This needs to be changed... */
    char dbx[160];
    snprintf(dbx, sizeof dbx, "echo 'where\ndetach' | dbx -a %d > %s.dump", getpid(), progname);
    /* Change the dbx to gdb */
    system(dbx);
    return;
}
void cleanup(void) {
    sigemptyset(&sigact.sa_mask);
    /* Do any cleaning up chores here */
}
You may additionally have to add a parameter to get gdb to dump the core, as shown in this blog.
There are more things that may influence the generation of a core dump. I encountered these:
the directory for the dump must be writable. By default this is the current directory of the process, but that may be changed by setting /proc/sys/kernel/core_pattern.
in some conditions, the kernel value in /proc/sys/fs/suid_dumpable may prevent the core from being generated.
There are more situations which may prevent the generation that are described in the man page - try man core.
For Ubuntu 14.04
Check core dump enabled:
ulimit -a
One of the lines should be :
core file size (blocks, -c) unlimited
If not :
gedit ~/.bashrc and add ulimit -c unlimited to end of file and save, re-run terminal.
Build your application with debug information :
In Makefile -O0 -g
Run the application that creates the core dump (a core dump file named ‘core’ should be created next to the application_name file):
./application_name
Run under gdb:
gdb application_name core
In order to activate the core dump, do the following:
1. In /etc/profile comment out the line:
# ulimit -S -c 0 > /dev/null 2>&1
2. In /etc/security/limits.conf comment out the line:
* soft core 0
3. Execute the command limit coredumpsize unlimited and check it with the command limit:
# limit coredumpsize unlimited
# limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize 10240 kbytes
coredumpsize unlimited
memoryuse unlimited
vmemoryuse unlimited
descriptors 1024
memorylocked 32 kbytes
maxproc 528383
#
To check whether the core file gets written, you can kill the relevant process with kill -s SEGV <PID> (this should not be needed; it is just a check in case no core file gets written):
# kill -s SEGV <PID>
Once the core file has been written, make sure to deactivate the core dump settings again in the relevant files (1./2./3.)!
Ubuntu 19.04
None of the other answers helped me on its own, but the following summary did the job.
Create ~/.config/apport/settings with the following content:
[main]
unpackaged=true
(This tells apport to also write core dumps for custom apps)
check: ulimit -c. If it outputs 0, fix it with
ulimit -c unlimited
Just in case, restart apport:
sudo systemctl restart apport
Crash files are now written in /var/crash/. But you cannot use them with gdb. To use them with gdb, use
apport-unpack <location_of_report> <target_directory>
Further information:
Some answers suggest changing core_pattern. Be aware that this file might get overwritten by the apport service on restart.
Simply stopping apport did not do the job
The ulimit -c value might get changed automatically while you're trying other answers from the web. Be sure to check it regularly while setting up your core dump creation.
References:
https://stackoverflow.com/a/47481884/6702598
By default you will get a core file. Check to see that the current directory of the process is writable, or no core file will be created.
Better to turn on core dump programmatically using system call setrlimit.
example:
#include <stdbool.h>
#include <sys/resource.h>
bool enable_core_dump(void) {
    struct rlimit corelim;
    corelim.rlim_cur = RLIM_INFINITY;
    corelim.rlim_max = RLIM_INFINITY;
    return (0 == setrlimit(RLIMIT_CORE, &corelim));
}
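The same idea from a Python program, using the standard resource module. This is a sketch rather than a drop-in equivalent: an unprivileged process cannot raise its hard limit, so instead of asking for RLIM_INFINITY outright it raises the soft limit as far as the existing hard limit allows. The function name enable_core_dumps is made up for the example.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-size limit to the hard limit; return the new soft limit."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)[0]


if __name__ == "__main__":
    soft = enable_core_dumps()
    if soft == resource.RLIM_INFINITY:
        print("core file size: unlimited")
    else:
        print("core file size limited to", soft, "bytes")
```

If the hard limit is 0, this still leaves core dumps disabled; in that case the limit has to be raised from outside the process (limits.conf, the service manager, and so on).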
It's worth mentioning that if you have a systemd set up, then things are a little bit different. The set up typically would have the core files be piped, by means of core_pattern sysctl value, through systemd-coredump(8). The core file size rlimit would typically be configured as "unlimited" already.
It is then possible to retrieve the core dumps using coredumpctl(1).
The storage of core dumps, etc. is configured by coredump.conf(5). There are examples of how to get the core files in the coredumpctl man page, but in short, it would look like this:
Find the core file:
[vps@phoenix]~$ coredumpctl list test_me | tail -1
Sun 2019-01-20 11:17:33 CET 16163 1224 1224 11 present /home/vps/test_me
Get the core file:
[vps@phoenix]~$ coredumpctl -o test_me.core dump 16163
This is typically sufficient:
ulimit -c unlimited
Note this will not persist between ssh sessions! To add persistence:
echo '* soft core unlimited' >> /etc/security/limits.conf
Now, if you're using Ubuntu, "apport" is probably running. Here's how to check:
sudo systemctl status apport.service
If it is, you'll probably find core dumps in one of these places:
/var/lib/apport/coredump
/var/crash
If you want to change the location of core dumps
Make sure that the directory you're sending core dumps to exists and that you have permission to create files in it!
Here's an example. Note this will not persist across reboots:
sysctl -w kernel.core_pattern=/coredumps/core-%e-%s-%u-%g-%p-%t
mkdir /coredumps
Make sure that the process that's crashing has access to write to this. The easiest way would be an example like this:
chmod 777 /coredumps
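To predict which file name a given pattern will produce, the % specifiers can be expanded by hand. Below is a small sketch of that substitution as described in the core(5) man page (%e executable name, %s signal number, %u UID, %g GID, %p PID, %t time of the dump in seconds since the epoch, %% a literal percent sign). expand_core_pattern is a made-up helper and the values in the usage example are invented.

```python
def expand_core_pattern(pattern, values):
    """Expand %-specifiers in a core_pattern template.

    `values` maps a specifier letter (e.g. "p") to its replacement.
    Unknown specifiers and a lone trailing '%' are dropped, as core(5) describes.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 == len(pattern):
            break  # a single '%' at the end of the template is dropped
        spec = pattern[i + 1]
        out.append("%" if spec == "%" else str(values.get(spec, "")))
        i += 2
    return "".join(out)


if __name__ == "__main__":
    example = {"e": "a.out", "s": 11, "u": 1000, "g": 1000,
               "p": 1234, "t": 1700000000}
    print(expand_core_pattern("/coredumps/core-%e-%s-%u-%g-%p-%t", example))
```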
Test that core dumps work
> crash.c
gcc -Wl,--defsym=main=0 crash.c
./a.out
==output== Segmentation fault (core dumped)
If it doesn't say "core dumped" above, something isn't working.
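The same check can be scripted. A rough Python sketch, assuming a POSIX system: fork a child, let it die from SIGSEGV, and inspect the wait status. crash_child is a made-up helper name; os.WCOREDUMP reports the same flag the shell prints as "(core dumped)".

```python
import os
import resource
import signal


def crash_child(allow_core=True):
    """Fork a child that dies from SIGSEGV.

    Returns (killed_by_signal, signal_number, core_dumped_flag).
    """
    pid = os.fork()
    if pid == 0:
        if not allow_core:
            # Request that no core file be written for this child.
            _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
            resource.setrlimit(resource.RLIMIT_CORE, (0, hard))
        signal.signal(signal.SIGSEGV, signal.SIG_DFL)  # default action: terminate
        os.kill(os.getpid(), signal.SIGSEGV)
        os._exit(1)  # not reached if the signal terminated us
    _pid, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return True, os.WTERMSIG(status), os.WCOREDUMP(status)
    return False, None, False


if __name__ == "__main__":
    signaled, signum, dumped = crash_child()
    print("killed by signal:", signum if signaled else None)
    print("core dumped:", dumped)
```

If "core dumped" comes back False here as well, the limit or the core_pattern setup is still in the way.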

Resources