How to use -V option in perf probe? - linux

I'm trying to print a value of variable in specific function using perf. So I tried perf probe with -V option, but I got error message like this.
# perf probe -V tcp_sendmsg
Failed to find the path for the kernel: Invalid ELF file
Error: Failed to show vars.
So I downloaded kernel symbol&source package and I checked that the value CONFIG_DEBUG_INFO in /boot/config-5.3.0-46-generic'set to 1. But still the same error occurs.
How should I fix this problem?
Ubuntu 18.04 LTS
release: 5.3.0-46-generic
version: 38~18.04.01

To make perf probe -V work, you will have to pass a vmlinux file that has debuginfo, which you seem to have done as per your comment -
sudo perf probe --vmlinux /usr/lib/debug/boot/vmlinux-4.15.0-1077-azure
-V tcp_sendmsg
Available variables at tcp_sendmsg
#<tcp_sendmsg+0>
size_t size
struct msghdr* msg
struct sock* sk
Now as per the man-page, perf probe -L means
-L, --line=
Show source code lines which can be probed.
This needs an argument which specifies a range of the source code.
(see LINE SYNTAX for detail)
which means you need to specify the location of the kernel source code, when you run perf probe -L, otherwise the elfutils tool, that analyzes the vmlinux file with debuginfo would not be able to determine the right path for the kernel sources.
You can download the kernel sources as mentioned here
And then run perf probe where you specify the --source switch,
sudo perf probe --source /usr/src/linux-source-4.4.0 --vmlinux /usr/lib/debug/boot/vmlinux-4.15.0-1077-azure -L tcp_sendmsg
<tcp_sendmsg#/usr/src/linux-source-4.4.0//net/ipv4/tcp.c:0>
0 /* Clear memory counter. */
1 tp->ucopy.memory = 0;
}
4 static struct sk_buff *tcp_recv_skb(struct sock *sk, u32 seq, u32 *off)
5 {
6 struct sk_buff *skb;
u32 offset;
9 while ((skb = skb_peek(&sk->sk_receive_queue)) != NULL) {
offset = seq - TCP_SKB_CB(skb)->seq;
if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)
offset--;

Related

Does GNU time memory output account for child processes too?

When running GNU time (/usr/bin/time) and checking for memory consumption, does its output account for the memory usage of the child processes of my target program?
Could not find anything in GNU's time manpage.
Yes.
You can easily check with:
$ /usr/bin/time -f '%M' sh -c 'perl -e "\$y=q{x}x(2*1024*1024)" & wait'
8132
$ /usr/bin/time -f '%M' sh -c 'perl -e "\$y=q{x}x(8*1024*1024)" & wait'
20648
GNU time is using the wait4 system call on Linux (via the wait3 glibc wrapper), and while undocumented, the resource usage it returns in the struct rusage also includes the descendands of the process waited for. You can look at the kernel implementation of wait4 in kernel/exit.c for all the details:
$ grep -C2 RUSAGE_BOTH include/uapi/linux/resource.h
#define RUSAGE_SELF 0
#define RUSAGE_CHILDREN (-1)
#define RUSAGE_BOTH (-2) /* sys_wait4() uses this */
#define RUSAGE_THREAD 1 /* only the calling thread */
FreeBSD and NetBSD also have a wait6 system call which returns separate info for the process waited for and for its descendants. They also clearly document that the rusage returned by wait3 and wait4 also includes grandchildren.

why does perf record and annotate not work?

I'm stumped, I read the perf tutorial and am trying to do a simple test beyond "perf stat" which works. However perf record either doesnt work ,or perf annotate shows no samples recorded. Running perf
For example(im running with sudo because without it i get a bunch of errors which i will post at the end):
sudo perf record -e cycles,instructions,cache-misses -a -c 1 ./FooExe
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 1.794 MB perf.data (~78393 samples) ]
.
sudo perf report -D -i perf.data |grep RECORD_SAMPLE |wc -l
Failed to open /tmp/perf-23796.map, continuing without symbols
20486
.
sudo perf annotate -d ./FooExe
the perf.data file has no samples! Press any key
So thats as far as i get. I tried to rebuild perf for my ssystem from source but that didnt seem to help either.
Im using Ubuntu 14.04 kernel 3.19.0-49-generic. This is on intel i7 I4510U cpu . I made sure to compile my program with symbols , but i get the same results regardless of which application i try to profile.
-- if i run without sudo :
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Cannot read kernel map
Error:
You may not have permission to collect system-wide stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid:
-1 - Not paranoid at all
0 - Disallow raw tracepoint access for unpriv
1 - Disallow cpu events for unpriv
2 - Disallow kernel profiling for unpriv
I just tried your command. The problem was that you used -a to profile all processes system-wide, so it never ran ./FooExe. You can confirm this with strace -f perf ... ./FooExe, and note the lack of any execve system call. And also the fact that it returns instantly, even if FooExe should have taken several seconds.
Here's an example of recording samples for a busy-loop awk command:
perf record -e cycles,instructions,cache-misses awk 'BEGIN{for(i=0;i<40000000;i++){}}'
Now perf report works. You don't need to specify the executable for the report command, because perf.data only has data for the one executable.
This works the same way with the ocperf.py wrapper, but you could record events for more uarch-specific events using symbolic names (instead of looking up codes and numeric arguments in -e):
$ ocperf.py record -e cycles,cache-misses,uops_dispatched_port.port_0 awk 'BEGIN{for(i=0;i<40000000;i++){}}'
perf record -e cycles,cache-misses,cpu/event=0xa1,umask=0x1,name=uops_dispatched_port_port_0,period=2000003/ awk 'BEGIN{for(i=0;i<40000000;i++){}}'
(warning lines about kernel symbols)
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.352 MB perf.data (7819 samples)
$ ocperf.py report

Does perf lock profile user space mutexes?

Summary: Does perf lock profile pthread_mutex?
Details:
The tool perf has an option perf lock. The man page says:
You can analyze various lock behaviours and statistics with this perf lock command.
'perf lock record <command>' records lock events
between start and end <command>. And this command
produces the file "perf.data" which contains tracing
results of lock events.
'perf lock trace' shows raw lock events.
'perf lock report' reports statistical data.
But when I tried running perf lock record I got an error saying: invalid or unsupported event: 'lock:lock_acquire'. I looked and it seems that error is probably because my kernel is not compiled with CONFIG_LOCKDEP or CONFIG_LOCK_STAT.
My question is: does perf lock report events related to user-space locks (like pthread_mutex) or only kernel locks? I'm more interested in profiling application that mostly run in user-space. I thought this option in perf looked interesting, but since I can't run it without compiling (or getting) a new kernel I'm interested in getting a better idea of what it does before I try.
Summary: Does perf lock profile pthread_mutex?
Summary: no, because there are no any tracepoint defined in user-space pthread_mutex.
According to source file tools/perf/builtin-lock.c (http://lxr.free-electrons.com/source/tools/perf/builtin-lock.c#L939) cmd_lock calls __cmd_record, which defines several tracepoints for perf record (via -e TRACEPOINT_NAME) and also pass options -R -m 1024 -c 1 to perf report. List of tracepoints defined: lock_tracepoints:
842 static const struct perf_evsel_str_handler lock_tracepoints[] = {
843 { "lock:lock_acquire", perf_evsel__process_lock_acquire, }, /* CONFIG_LOCKDEP */
844 { "lock:lock_acquired", perf_evsel__process_lock_acquired, }, /* CONFIG_LOCKDEP, CONFIG_LOCK_STAT */
845 { "lock:lock_contended", perf_evsel__process_lock_contended, }, /* CONFIG_LOCKDEP, CONFIG_LOCK_STAT */
846 { "lock:lock_release", perf_evsel__process_lock_release, }, /* CONFIG_LOCKDEP */
847 };
TRACE_EVENT(lock_acquire,.. is defined in trace/events/lock.h. And
trace_lock_acquire is defined only in kernel/locking/lockdep.c (recheck in debian codebase: http://codesearch.debian.net/search?q=trace_lock_acquire).
Only CONFIG_LOCKDEP is missing from your kernel according to kernel/locking/Makefile: obj-$(CONFIG_LOCKDEP) += lockdep.o (tracepoints are defined unconditionally in the lockdep.c.
According to https://www.kernel.org/doc/Documentation/trace/tracepoints.txt all tracepoints are kernel-only, so perf lock will not profile user-space locks.
You can try tracepoints from LTTng, the project which declares user-space tracepoints (http://lttng.org/ust). But there will be no ready lock statistics, only raw data on tracepoints. Also you should define tracepoints with tracef() macro (recompile pthreads/glibc, or try to create your own wrapper around pthread).
No. But maybe what you need can be done by recording sched_stat_sleep and sched_switch events:
$ perf record -e sched:sched_stat_sleep,sched:sched_switch -g -o perf.data.raw yourprog
$ perf inject -v -s -i perf.data.raw -o perf.data
$ perf report --stdio --no-children
Note: make sure your kernel is compiled with CONFIG_SCHEDSTATS=y and it is enabled at /proc/sys/kernel/sched_schedstats
See more at https://perf.wiki.kernel.org/index.php/Tutorial#Profiling_sleep_times.

Is there a way to figure out what is using a Linux kernel module?

If I load a kernel module and list the loaded modules with lsmod, I can get the "use count" of the module (number of other modules with a reference to the module). Is there a way to figure out what is using a module, though?
The issue is that a module I am developing insists its use count is 1 and thus I cannot use rmmod to unload it, but its "by" column is empty. This means that every time I want to re-compile and re-load the module, I have to reboot the machine (or, at least, I can't figure out any other way to unload it).
Actually, there seems to be a way to list processes that claim a module/driver - however, I haven't seen it advertised (outside of Linux kernel documentation), so I'll jot down my notes here:
First of all, many thanks for #haggai_e's answer; the pointer to the functions try_module_get and try_module_put as those responsible for managing the use count (refcount) was the key that allowed me to track down the procedure.
Looking further for this online, I somehow stumbled upon the post Linux-Kernel Archive: [PATCH 1/2] tracing: Reduce overhead of module tracepoints; which finally pointed to a facility present in the kernel, known as (I guess) "tracing"; the documentation for this is in the directory Documentation/trace - Linux kernel source tree. In particular, two files explain the tracing facility, events.txt and ftrace.txt.
But, there is also a short "tracing mini-HOWTO" on a running Linux system in /sys/kernel/debug/tracing/README (see also I'm really really tired of people saying that there's no documentation…); note that in the kernel source tree, this file is actually generated by the file kernel/trace/trace.c. I've tested this on Ubuntu natty, and note that since /sys is owned by root, you have to use sudo to read this file, as in sudo cat or
sudo less /sys/kernel/debug/tracing/README
... and that goes for pretty much all other operations under /sys which will be described here.
First of all, here is a simple minimal module/driver code (which I put together from the referred resources), which simply creates a /proc/testmod-sample file node, which returns the string "This is testmod." when it is being read; this is testmod.c:
/*
https://github.com/spotify/linux/blob/master/samples/tracepoints/tracepoint-sample.c
https://www.linux.com/learn/linux-training/37985-the-kernel-newbie-corner-kernel-debugging-using-proc-qsequenceq-files-part-1
*/
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h> // for sequence files
struct proc_dir_entry *pentry_sample;
char *defaultOutput = "This is testmod.";
static int my_show(struct seq_file *m, void *v)
{
seq_printf(m, "%s\n", defaultOutput);
return 0;
}
static int my_open(struct inode *inode, struct file *file)
{
return single_open(file, my_show, NULL);
}
static const struct file_operations mark_ops = {
.owner = THIS_MODULE,
.open = my_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
};
static int __init sample_init(void)
{
printk(KERN_ALERT "sample init\n");
pentry_sample = proc_create(
"testmod-sample", 0444, NULL, &mark_ops);
if (!pentry_sample)
return -EPERM;
return 0;
}
static void __exit sample_exit(void)
{
printk(KERN_ALERT "sample exit\n");
remove_proc_entry("testmod-sample", NULL);
}
module_init(sample_init);
module_exit(sample_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Mathieu Desnoyers et al.");
MODULE_DESCRIPTION("based on Tracepoint sample");
This module can be built with the following Makefile (just have it placed in the same directory as testmod.c, and then run make in that same directory):
CONFIG_MODULE_FORCE_UNLOAD=y
# for oprofile
DEBUG_INFO=y
EXTRA_CFLAGS=-g -O0
obj-m += testmod.o
# mind the tab characters needed at start here:
all:
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
When this module/driver is built, the output is a kernel object file, testmod.ko.
At this point, we can prepare the event tracing related to try_module_get and try_module_put; those are in /sys/kernel/debug/tracing/events/module:
$ sudo ls /sys/kernel/debug/tracing/events/module
enable filter module_free module_get module_load module_put module_request
Note that on my system, tracing is by default enabled:
$ sudo cat /sys/kernel/debug/tracing/tracing_enabled
1
... however, the module tracing (specifically) is not:
$ sudo cat /sys/kernel/debug/tracing/events/module/enable
0
Now, we should first make a filter, that will react on the module_get, module_put etc events, but only for the testmod module. To do that, we should first check the format of the event:
$ sudo cat /sys/kernel/debug/tracing/events/module/module_put/format
name: module_put
ID: 312
format:
...
field:__data_loc char[] name; offset:20; size:4; signed:1;
print fmt: "%s call_site=%pf refcnt=%d", __get_str(name), (void *)REC->ip, REC->refcnt
Here we can see that there is a field called name, which holds the driver name, which we can filter against. To create a filter, we simply echo the filter string into the corresponding file:
sudo bash -c "echo name == testmod > /sys/kernel/debug/tracing/events/module/filter"
Here, first note that since we have to call sudo, we have to wrap the whole echo redirection as an argument command of a sudo-ed bash. Second, note that since we wrote to the "parent" module/filter, not the specific events (which would be module/module_put/filter etc), this filter will be applied to all events listed as "children" of module directory.
Finally, we enable tracing for module:
sudo bash -c "echo 1 > /sys/kernel/debug/tracing/events/module/enable"
From this point on, we can read the trace log file; for me, reading the blocking,
"piped" version of the trace file worked - like this:
sudo cat /sys/kernel/debug/tracing/trace_pipe | tee tracelog.txt
At this point, we will not see anything in the log - so it is time to load (and utilize, and remove) the driver (in a different terminal from where trace_pipe is being read):
$ sudo insmod ./testmod.ko
$ cat /proc/testmod-sample
This is testmod.
$ sudo rmmod testmod
If we go back to the terminal where trace_pipe is being read, we should see something like:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
insmod-21137 [001] 28038.101509: module_load: testmod
insmod-21137 [001] 28038.103904: module_put: testmod call_site=sys_init_module refcnt=2
rmmod-21354 [000] 28080.244448: module_free: testmod
That is pretty much all we will obtain for our testmod driver - the refcount changes only when the driver is loaded (insmod) or unloaded (rmmod), not when we do a read through cat. So we can simply interrupt the read from trace_pipe with CTRL+C in that terminal; and to stop the tracing altogether:
sudo bash -c "echo 0 > /sys/kernel/debug/tracing/tracing_enabled"
Here, note that most examples refer to reading the file /sys/kernel/debug/tracing/trace instead of trace_pipe as here. However, one problem is that this file is not meant to be "piped" (so you shouldn't run a tail -f on this trace file); but instead you should re-read the trace after each operation. After the first insmod, we would obtain the same output from cat-ing both trace and trace_pipe; however, after the rmmod, reading the trace file would give:
<...>-21137 [001] 28038.101509: module_load: testmod
<...>-21137 [001] 28038.103904: module_put: testmod call_site=sys_init_module refcnt=2
rmmod-21354 [000] 28080.244448: module_free: testmod
... that is: at this point, the insmod had already been exited for long, and so it doesn't exist anymore in the process list - and therefore cannot be found via the recorded process ID (PID) at the time - thus we get a blank <...> as process name. Therefore, it is better to log (via tee) a running output from trace_pipe in this case. Also, note that in order to clear/reset/erase the trace file, one simply writes a 0 to it:
sudo bash -c "echo 0 > /sys/kernel/debug/tracing/trace"
If this seems counterintuitive, note that trace is a special file, and will always report a file size of zero anyways:
$ sudo ls -la /sys/kernel/debug/tracing/trace
-rw-r--r-- 1 root root 0 2013-03-19 06:39 /sys/kernel/debug/tracing/trace
... even if it is "full".
Finally, note that if we didn't implement a filter, we would have obtained a log of all module calls on the running system - which would log any call (also background) to grep and such, as those use the binfmt_misc module:
...
tr-6232 [001] 25149.815373: module_put: binfmt_misc call_site=search_binary_handler refcnt=133194
..
grep-6231 [001] 25149.816923: module_put: binfmt_misc call_site=search_binary_handler refcnt=133196
..
cut-6233 [000] 25149.817842: module_put: binfmt_misc call_site=search_binary_handler refcnt=129669
..
sudo-6234 [001] 25150.289519: module_put: binfmt_misc call_site=search_binary_handler refcnt=133198
..
tail-6235 [000] 25150.316002: module_put: binfmt_misc call_site=search_binary_handler refcnt=129671
... which adds quite a bit of overhead (in both log data ammount, and processing time required to generate it).
While looking this up, I stumbled upon Debugging Linux Kernel by Ftrace PDF, which refers to a tool trace-cmd, which pretty much does the similar as above - but through an easier command line interface. There is also a "front-end reader" GUI for trace-cmd called KernelShark; both of these are also in Debian/Ubuntu repositories via sudo apt-get install trace-cmd kernelshark. These tools could be an alternative to the procedure described above.
Finally, I'd just note that, while the above testmod example doesn't really show use in context of multiple claims, I have used the same tracing procedure to discover that an USB module I'm coding, was repeatedly claimed by pulseaudio as soon as the USB device was plugged in - so the procedure seems to work for such use cases.
It says on the Linux Kernel Module Programming Guide that the use count of a module is controlled by the functions try_module_get and module_put. Perhaps you can find where these functions are called for your module.
More info: https://www.kernel.org/doc/htmldocs/kernel-hacking/routines-module-use-counters.html
All you get are a list of which modules depend on which other modules (the Used by column in lsmod). You can't write a program to tell why the module was loaded, if it is still needed for anything, or what might break if you unload it and everything that depends on it.
You might try lsof or fuser.
If you use rmmod WITHOUT the --force option, it will tell you what is using a module. Example:
$ lsmod | grep firewire
firewire_ohci 24695 0
firewire_core 50151 1 firewire_ohci
crc_itu_t 1717 1 firewire_core
$ sudo modprobe -r firewire-core
FATAL: Module firewire_core is in use.
$ sudo rmmod firewire_core
ERROR: Module firewire_core is in use by firewire_ohci
$ sudo modprobe -r firewire-ohci
$ sudo modprobe -r firewire-core
$ lsmod | grep firewire
$
try kgdb and set breakpoint to your module
For anyone desperate to figure out why they can't reload modules, I was able to work around this problem by
Getting the path of the currently used module using "modinfo"
rm -rfing it
Copying the new module I wanted to load to the path it was in
Typing "modprobe DRIVER_NAME.ko".

How to generate a core dump in Linux on a segmentation fault?

I have a process in Linux that's getting a segmentation fault. How can I tell it to generate a core dump when it fails?
This depends on what shell you are using. If you are using bash, then the ulimit command controls several settings relating to program execution, such as whether you should dump core. If you type
ulimit -c unlimited
then that will tell bash that its programs can dump cores of any size. You can specify a size such as 52M instead of unlimited if you want, but in practice this shouldn't be necessary since the size of core files will probably never be an issue for you.
In tcsh, you'd type
limit coredumpsize unlimited
As explained above the real question being asked here is how to enable core dumps on a system where they are not enabled. That question is answered here.
If you've come here hoping to learn how to generate a core dump for a hung process, the answer is
gcore <pid>
if gcore is not available on your system then
kill -ABRT <pid>
Don't use kill -SEGV as that will often invoke a signal handler making it harder to diagnose the stuck process
To check where the core dumps are generated, run:
sysctl kernel.core_pattern
or:
cat /proc/sys/kernel/core_pattern
where %e is the process name and %t the system time. You can change it in /etc/sysctl.conf and reloading by sysctl -p.
If the core files are not generated (test it by: sleep 10 & and killall -SIGSEGV sleep), check the limits by: ulimit -a.
If your core file size is limited, run:
ulimit -c unlimited
to make it unlimited.
Then test again, if the core dumping is successful, you will see “(core dumped)” after the segmentation fault indication as below:
Segmentation fault: 11 (core dumped)
See also: core dumped - but core file is not in current directory?
Ubuntu
In Ubuntu the core dumps are handled by Apport and can be located in /var/crash/. However, it is disabled by default in stable releases.
For more details, please check: Where do I find the core dump in Ubuntu?.
macOS
For macOS, see: How to generate core dumps in Mac OS X?
What I did at the end was attach gdb to the process before it crashed, and then when it got the segfault I executed the generate-core-file command. That forced generation of a core dump.
Maybe you could do it this way, this program is a demonstration of how to trap a segmentation fault and shells out to a debugger (this is the original code used under AIX) and prints the stack trace up to the point of a segmentation fault. You will need to change the sprintf variable to use gdb in the case of Linux.
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <stdarg.h>
static void signal_handler(int);
static void dumpstack(void);
static void cleanup(void);
void init_signals(void);
void panic(const char *, ...);
struct sigaction sigact;
char *progname;
int main(int argc, char **argv) {
char *s;
progname = *(argv);
atexit(cleanup);
init_signals();
printf("About to seg fault by assigning zero to *s\n");
*s = 0;
sigemptyset(&sigact.sa_mask);
return 0;
}
void init_signals(void) {
sigact.sa_handler = signal_handler;
sigemptyset(&sigact.sa_mask);
sigact.sa_flags = 0;
sigaction(SIGINT, &sigact, (struct sigaction *)NULL);
sigaddset(&sigact.sa_mask, SIGSEGV);
sigaction(SIGSEGV, &sigact, (struct sigaction *)NULL);
sigaddset(&sigact.sa_mask, SIGBUS);
sigaction(SIGBUS, &sigact, (struct sigaction *)NULL);
sigaddset(&sigact.sa_mask, SIGQUIT);
sigaction(SIGQUIT, &sigact, (struct sigaction *)NULL);
sigaddset(&sigact.sa_mask, SIGHUP);
sigaction(SIGHUP, &sigact, (struct sigaction *)NULL);
sigaddset(&sigact.sa_mask, SIGKILL);
sigaction(SIGKILL, &sigact, (struct sigaction *)NULL);
}
static void signal_handler(int sig) {
if (sig == SIGHUP) panic("FATAL: Program hanged up\n");
if (sig == SIGSEGV || sig == SIGBUS){
dumpstack();
panic("FATAL: %s Fault. Logged StackTrace\n", (sig == SIGSEGV) ? "Segmentation" : ((sig == SIGBUS) ? "Bus" : "Unknown"));
}
if (sig == SIGQUIT) panic("QUIT signal ended program\n");
if (sig == SIGKILL) panic("KILL signal ended program\n");
if (sig == SIGINT) ;
}
void panic(const char *fmt, ...) {
char buf[50];
va_list argptr;
va_start(argptr, fmt);
vsprintf(buf, fmt, argptr);
va_end(argptr);
fprintf(stderr, buf);
exit(-1);
}
static void dumpstack(void) {
/* Got this routine from http://www.whitefang.com/unix/faq_toc.html
** Section 6.5. Modified to redirect to file to prevent clutter
*/
/* This needs to be changed... */
char dbx[160];
sprintf(dbx, "echo 'where\ndetach' | dbx -a %d > %s.dump", getpid(), progname);
/* Change the dbx to gdb */
system(dbx);
return;
}
void cleanup(void) {
sigemptyset(&sigact.sa_mask);
/* Do any cleaning up chores here */
}
You may have to additionally add a parameter to get gdb to dump the core as shown here in this blog here.
There are more things that may influence the generation of a core dump. I encountered these:
the directory for the dump must be writable. By default this is the current directory of the process, but that may be changed by setting /proc/sys/kernel/core_pattern.
in some conditions, the kernel value in /proc/sys/fs/suid_dumpable may prevent the core to be generated.
There are more situations which may prevent the generation that are described in the man page - try man core.
For Ubuntu 14.04
Check core dump enabled:
ulimit -a
One of the lines should be :
core file size (blocks, -c) unlimited
If not :
gedit ~/.bashrc and add ulimit -c unlimited to end of file and save, re-run terminal.
Build your application with debug information :
In Makefile -O0 -g
Run application that create core dump (core dump file with name ‘core’ should be created near application_name file):
./application_name
Run under gdb:
gdb application_name core
In order to activate the core dump do the following:
In /etc/profile comment the line:
# ulimit -S -c 0 > /dev/null 2>&1
In /etc/security/limits.conf comment out the line:
* soft core 0
execute the cmd limit coredumpsize unlimited and check it with cmd limit:
# limit coredumpsize unlimited
# limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize 10240 kbytes
coredumpsize unlimited
memoryuse unlimited
vmemoryuse unlimited
descriptors 1024
memorylocked 32 kbytes
maxproc 528383
#
to check if the corefile gets written you can kill the relating process with cmd kill -s SEGV <PID> (should not be needed, just in case no core file gets written this can be used as a check):
# kill -s SEGV <PID>
Once the corefile has been written make sure to deactivate the coredump settings again in the relating files (1./2./3.) !
Ubuntu 19.04
All other answers themselves didn't help me. But the following sum up did the job
Create ~/.config/apport/settings with the following content:
[main]
unpackaged=true
(This tells apport to also write core dumps for custom apps)
check: ulimit -c. If it outputs 0, fix it with
ulimit -c unlimited
Just for in case restart apport:
sudo systemctl restart apport
Crash files are now written in /var/crash/. But you cannot use them with gdb. To use them with gdb, use
apport-unpack <location_of_report> <target_directory>
Further information:
Some answers suggest changing core_pattern. Be aware, that that file might get overwritten by the apport service on restarting.
Simply stopping apport did not do the job
The ulimit -c value might get changed automatically while you're trying other answers of the web. Be sure to check it regularly during setting up your core dump creation.
References:
https://stackoverflow.com/a/47481884/6702598
By default you will get a core file. Check to see that the current directory of the process is writable, or no core file will be created.
Better to turn on core dump programmatically using system call setrlimit.
example:
#include <sys/resource.h>
bool enable_core_dump(){
struct rlimit corelim;
corelim.rlim_cur = RLIM_INFINITY;
corelim.rlim_max = RLIM_INFINITY;
return (0 == setrlimit(RLIMIT_CORE, &corelim));
}
It's worth mentioning that if you have a systemd set up, then things are a little bit different. The set up typically would have the core files be piped, by means of core_pattern sysctl value, through systemd-coredump(8). The core file size rlimit would typically be configured as "unlimited" already.
It is then possible to retrieve the core dumps using coredumpctl(1).
The storage of core dumps, etc. is configured by coredump.conf(5). There are examples of how to get the core files in the coredumpctl man page, but in short, it would look like this:
Find the core file:
[vps#phoenix]~$ coredumpctl list test_me | tail -1
Sun 2019-01-20 11:17:33 CET 16163 1224 1224 11 present /home/vps/test_me
Get the core file:
[vps#phoenix]~$ coredumpctl -o test_me.core dump 16163
This is typically sufficient:
ulimit -c unlimited
Note this will not persist between ssh sections! To add persistence:
echo '* soft core unlimited' >> /etc/security/limits.conf
Now, if you're using Ubuntu, "apport" is probably running. Here's how to check:
sudo systemctl status apport.service
If it is, you'll probably find core dumps in one of these places:
/var/lib/apport/coredump
/var/crash
If you want to change the location of core dumps
Make sure that you have the permissions to create files and the directory exists in the directory you're sending a core dump to!
Here's an example. Note this will not persist across reboots:
sysctl -w kernel.core_pattern=/coredumps/core-%e-%s-%u-%g-%p-%t
mkdir /coredumps
Make sure that the process that's crashing has access to write to this. The easiest way would be an example like this:
chmod 777 /coredumps
Test that core dumps works
> crash.c
gcc -Wl,--defsym=main=0 crash.c
./a.out
==output== Segmentation fault (core dumped)
If it doesn't say "core dumped" above, something isn't working.

Resources