Why is "echo l > /proc/sysrq-trigger" call trace output always similar? - linux

According to the official kernel.org documentation, echo l > /proc/sysrq-trigger is supposed to give me the current call trace of all CPUs. But when I do this a couple of times and then look into dmesg, the call traces look virtually identical each time. Why is that?

Why the backtraces look the same
In your case, the CPU #0 backtrace shows that it is executing your sysrq command (judging by the write_sysrq_trigger() function):
delay_tsc+0x1f/0x70
arch_trigger_all_cpu_backtrace+0x10a/0x140
__handle_sysrq+0xfc/0x160
write_sysrq_trigger+0x2b/0x30
proc_reg_write+0x39/0x70
vfs_write+0xb2/0x1f0
SyS_write+0x42/0xa0
system_call_fast_compare_end+0x10/0x15
and the CPU #1 backtrace shows that it is idle (judging by the cpuidle_enter_state() function):
cpuidle_enter_state+0x40/0xc0
cpu_startup_entry+0x2f8/0x400
start_secondary+0x20f/0x2d0
Put your system under heavy load and then execute your sysrq command again to get new backtraces. You will see that one CPU is executing your sysrq command, while the second CPU is no longer idle but doing some actual work; one quick way to generate such load is sketched below.
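For illustration, here is a minimal C sketch (the file name busy.c and the choice of CPU #1 are my own, and it assumes a multi-core machine): it pins a busy loop to CPU #1, so that after starting it in the background and triggering sysrq again, that CPU's backtrace no longer shows cpuidle_enter_state().
/* busy.c - pin a busy loop to CPU #1 so sysrq-l shows that CPU doing real work */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
int main(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(1, &set);                        /* restrict this process to CPU #1 */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    for (;;)                                 /* burn cycles until killed */
        ;
}
Compile with gcc -o busy busy.c, run ./busy &, then (as root) echo l > /proc/sysrq-trigger and check dmesg.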
User-space backtrace
As for user-space functions in the kernel backtrace: although the system call is executing (in kernel space) on behalf of a user-space process (see Comm: bash in your backtrace for CPU 0), it's not possible to print the user-space process's backtrace using the standard kernel backtrace mechanism (which is implemented in the dump_stack() function). The problem is that the kernel stack doesn't contain any user-space calls, which is why you can see only kernel functions in your backtraces.
User-space calls can be found in the user-space stack of the corresponding process. For this purpose I would recommend the OProfile profiler. Of course, it will give you just a binary stack (raw addresses); in order to obtain the actual function names you will need to provide symbol information to gdb.
Details:
[1] kernel stack and user-space stack
[2] how to dump kernel stack in syscall
[3] How to print the userspace stack trace in linux kernelspace
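As an aside (this is a different technique from OProfile, shown only for illustration): if you can modify the user-space program itself, glibc's backtrace() and backtrace_symbols_fd() can print the process's own user-space stack, roughly like this:
/* print the calling thread's user-space stack to stderr */
#include <execinfo.h>
#include <unistd.h>
void show_stack(void)
{
    void *frames[32];
    int n = backtrace(frames, 32);                   /* collect return addresses */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);  /* resolve and print them */
}
void inner(void) { show_stack(); }
int main(void)
{
    inner();
    return 0;
}
Build it with gcc -g -rdynamic so the function names can be resolved at run time.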

Related

game renderer thread backtrace with no symbols on linux

I have a game application running on Linux. We are a gaming company. I am hitting a random crash that occurs roughly once every 24-48 hours. The last time it occurred I tried to see the backtrace of the thread where it crashed; however, gdb showed that the stack was corrupted, with no symbols.
Now, when I run the game and interrupt it in gdb, sometimes I am able to see the function call stack for this thread, but most of the time I do not see any symbols. The thread is a renderer thread.
Some of the game libraries we are using are proprietary third-party code with no debugging symbols. So I was wondering: could it be that the renderer thread's call stack is deep inside these libraries without symbols (various calls within the library), and that is why I do not get to see the call stack? If that is true, how can I fix this?
If not, any idea what the cause could be?
(gdb) bt
#0 0x9f488882 in ?? ()
Also, I did an info proc mappings, and for the address above in bt I found the following:
0x9f488000 0x9f48a000 0x2000 0x0 /tmp/glyFI8DP (deleted)
This means that your third-party library is using just-in-time compilation: it generates some code, mmaps it into your process, and then deletes the backing file.
On x86_64, GDB needs unwind descriptors to unwind the stack, but it can't get them from the deleted file, so you get no stack trace.
You have a few options:
contact the third-party developers and ask them "how can we get stack traces in this situation?"
dump the contents of the region with the GDB dump memory command:
(gdb) dump memory /tmp/gly.so 0x9f488000 0x9f48a000
If you are lucky, the resulting binary would actually be an ELF (it doesn't have to be), and may have symbols and unwind descriptors in it. Use readelf --all /tmp/gly.so to look inside.
If it is an ELF file, you can let GDB know that that's what's mapped at 0x9f488000. You'll need to find the address ($tstart below) of .text section in it (should be in readelf output), then:
(gdb) add-symbol-file /tmp/gly.so 0x9f488000+$tstart

Different Stack Pointer at the same point for multiple runs

I am encountering the following behavior for multiple runs of a program on a Linux x86 machine:
the command run is env -i ./prog. The program is crafted to fail with a SIGSEGV
the output from dmesg shows that the Stack Pointer at the time of the SIGSEGV varies after each execution, even though the program's flow is exactly the same
I have disabled ASLR.
Why is the SP varying after each execution?

access a process's kernel stack given process id in kernel debugging

I have Linux running in VMware, and I use gdb on the host machine to attach to it when debugging. While running, my kernel causes some of the processes to hang, and I would like to investigate further.
What the kernel gives me is the process id of the hung process along with a stack trace. However, without the arguments being passed, the stack trace is not very useful, so I want to gather more information. I have two questions:
Given the pid, how can I get the task_struct that corresponds to the process? I tried to do "p find_task_by_pid_ns(2533, &init_pid_ns)" under gdb; however, it hangs.
Once I have the task_struct and the stack pointer, my ultimate goal would be to reproduce the stack trace (with the arguments of each function called). Is there a tool to do that? Can gdb take a stack pointer and print the stack trace for me?
Thanks.
KDB will be helpful in this case. I don't know which kernel version you are using, but if you are on linux-2.6.35 or later, you can switch from gdb to kdb using the following command:
maintenance packet 3
Once you are in kdb, you can use the ps command to find the process descriptor address and the bt command to trace a stack. Alternatively, you can run kdb commands from gdb using gdb's 'monitor' command. For example, to use kdb's 'ps' command, type the following in gdb:
(gdb) monitor ps
You can get the list of kdb commands using the following command:
(gdb) monitor help
Once you know the process descriptor, you can use the following documentation to trace any process's stack.
http://www.emntech.com/documentation/debugging/kdb.pdf

kernel stack trace while carrying out specific command

While typing a command like ifconfig 10.0.0.10 up, is it possible to see all "possible" prints inside the kernel?
I know that something like echo t > /proc/sysrq-trigger will give you stack traces of the processes running in the system.
What I am interested in is: with respect to a specific command, how can I get the kernel functions (stack trace) that get executed?
I know about debuggers like kgdb, but I am interested in quick ways like the sysrq method, if any exist.
Thanks.
The answer to your question is "ftrace". It is not a tool and not a command, but just a kernel feature built into most modern Linux kernels.
For example, here you can see ftrace used to understand how swap space is implemented (all the key functions executed, and their sequence, are in the pastebin files indicated in the post below):
http://tthtlc.wordpress.com/2013/11/19/using-ftrace-to-understanding-linux-kernel-api/
Read this carefully and you will see that there are many ways of using ftrace (one is dumping the kernel stack trace, which is what you asked for; another is identifying the flow of executed functions):
http://lwn.net/Articles/366796/
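If it helps, here is a rough C sketch of the usual tracefs workflow (the helper name trace_write, the program name ftrace_cmd.c and the hard-coded /sys/kernel/debug/tracing path are my own assumptions; on newer systems tracefs may be mounted at /sys/kernel/tracing instead, and the program must run as root): it enables the function_graph tracer around a command so that the kernel functions triggered by that command end up in the trace buffer.
/* ftrace_cmd.c - run a command with the function_graph tracer enabled */
#include <stdio.h>
#include <stdlib.h>
#define TRACE_DIR "/sys/kernel/debug/tracing"
static void trace_write(const char *file, const char *val)
{
    char path[256];
    snprintf(path, sizeof(path), "%s/%s", TRACE_DIR, file);
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); exit(1); }
    fputs(val, f);
    fclose(f);
}
int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s \"command\"\n", argv[0]);
        return 1;
    }
    trace_write("current_tracer", "function_graph"); /* record the kernel call graph */
    trace_write("trace", "\n");                      /* clear the ring buffer */
    trace_write("tracing_on", "1");
    system(argv[1]);                                 /* e.g. "ifconfig eth0 10.0.0.10 up" */
    trace_write("tracing_on", "0");
    printf("kernel call flow captured in %s/trace\n", TRACE_DIR);
    return 0;
}
Afterwards, cat /sys/kernel/debug/tracing/trace shows the captured call graph. In practice you would also want to narrow the output (for example by writing the command's PID to set_ftrace_pid), otherwise activity from every CPU is recorded.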
If you don't want to use ftrace, another option is to use QEMU: you need to install Linux inside a QEMU guest, and it is a lot more powerful, as you can use gdb to step through every line (in the C source code or in assembly).
https://tthtlc.wordpress.com/2014/01/14/how-to-do-kernel-debugging-via-gdb-over-serial-port-via-qemu/
Just in case you want to google further: this setup is called "kgdb", or gdbserver, and outside QEMU you run a gdb client.
tail -f /var/log/kern.log will display kernel log messages as they occur.
It is more or less equivalent to the dmesg command.
strace ifconfig 10.0.0.10 up will show all the system calls made by ifconfig, but will not show the calls inside the kernel.

Segfault on stack overflow

Why does the Linux kernel generate a segfault on stack overflow? This can make debugging very awkward when alloca in C, or the creation of temporary arrays in Fortran, overflows the stack. Surely it must be possible for the runtime to produce a more helpful error.
You can actually catch the condition for a stack overflow using signal handlers.
To do this, you must do two things:
Set up a signal handler for SIGSEGV (the segfault) using sigaction; when you do so, set the SA_ONSTACK flag. This instructs the kernel to use an alternate stack when delivering the signal.
Call sigaltstack() to set up the alternate stack that the handler for SIGSEGV will use.
Then when you overflow the stack, the kernel will switch to your alternate stack before delivering the signal. Once in your signal handler, you can examine the address that caused the fault and determine if it was a stack overflow, or a regular fault.
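A minimal sketch of those two steps (the names on_segv, altstack and recurse are mine, and error checking is omitted for brevity):
/* install a SIGSEGV handler on an alternate stack, then blow the main stack */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
static char altstack[64 * 1024];            /* the alternate stack for the handler */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    /* info->si_addr is the faulting address; comparing it against the thread's
     * stack bounds (e.g. from /proc/self/maps) tells you whether this was a
     * stack overflow or an ordinary bad pointer. fprintf() is not strictly
     * async-signal-safe, but it is fine for a demo. */
    fprintf(stderr, "SIGSEGV at %p\n", info->si_addr);
    _exit(1);
}
static int recurse(int depth)               /* deliberately overflow the stack */
{
    volatile char pad[4096];
    pad[0] = (char)depth;
    return recurse(depth + 1) + pad[0];     /* not a tail call, so frames pile up */
}
int main(void)
{
    stack_t ss;
    ss.ss_sp = altstack;
    ss.ss_size = sizeof(altstack);
    ss.ss_flags = 0;
    sigaltstack(&ss, NULL);                 /* step 2: register the alternate stack */

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;  /* step 1: deliver SIGSEGV on that stack */
    sigaction(SIGSEGV, &sa, NULL);

    return recurse(0);
}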
The "kernel" (it's actually not the kernel running your code, it's the CPU) doesn't know how your code is referencing the memory it's not supposed to be touching. It only knows that you tried to do it.
The code:
char *x = alloca(100);
char y = x[150];
can't really be interpreted by the CPU as an attempt to access beyond the bounds of x.
You may hit the exact same address with:
char y = *((char*)(0xdeadbeef));
BTW, I would discourage the use of alloca, since the stack tends to be much more limited than the heap (use malloc instead).
A stack overflow is a segmentation fault, in the sense that you have broken the bounds of the memory you were initially allocated. The stack is of finite size, and you have exceeded it. You can read more about it on Wikipedia.
Additionally, one thing I've done for projects in the past is write my own signal handler for the segfault (see the signal(2) man page). I usually caught the signal and wrote "Fatal error has occurred" to the console, and did some further work with checkpoint flags and debugging.
In order to debug segfaults you can run a program in GDB. For example, the following C program (segfault.c) will segfault:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
    printf("Starting\n");
    void *foo = malloc(1000);
    memcpy(foo, 0, 100); /* this line will segfault */
    exit(0);
}
If I compile it like so:
gcc -g -o segfault segfault.c
and then run it like so:
$ gdb ./segfault
GNU gdb 6.7.1
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) run
Starting program: /tmp/segfault
Starting
Program received signal SIGSEGV, Segmentation fault.
0x4ea43cbc in memcpy () from /lib/libc.so.6
(gdb) bt
#0 0x4ea43cbc in memcpy () from /lib/libc.so.6
#1 0x080484cb in main () at segfault.c:8
(gdb)
I find out from GDB that there was a segmentation fault on line 8. Of course there are more complex ways of handling stack overflows and other memory errors, but this will suffice.
Simply use Valgrind. It will point out all your memory allocation mistakes with excruciating precision.
A stack overflow does not necessarily yield a crash. It may silently trash your program's data but continue to execute.
I wouldn't use SIGSEGV handler kludges but instead fix the original problem.
If you want automated help, you can use gcc's -fstack-protector option, which will spot some stack overruns at runtime and abort the program.
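For what it's worth, here is a small sketch of the kind of bug the stack protector catches at runtime (note that this is a stack buffer overrun rather than stack exhaustion; the file name smash.c is mine):
/* smash.c - overruns a stack buffer and trips the stack-protector canary */
#include <string.h>
static void copy(const char *src)
{
    char buf[8];
    strcpy(buf, src);   /* writes far past the end of buf */
}
int main(void)
{
    copy("this string is far longer than eight bytes");
    return 0;
}
Built with gcc -fstack-protector smash.c, the program aborts with "*** stack smashing detected ***" when copy() returns.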
valgrind is good for dynamic memory allocation bugs, but not for stack errors.
Some of the comments are helpful, but the problem is not one of memory allocation errors; that is, there is no mistake in the code. It's quite a nuisance in Fortran, where the runtime allocates temporaries on the stack. Thus a statement such as
write(fp)x,y,z
can trigger a segfault with no warning. Technical support for the Intel Fortran compiler says that there is no way for the runtime library to print a more helpful message. However, if Miguel is right, then this should be possible as he suggests, so thanks a lot. The remaining question then is how I first find the address of the segfault and then figure out whether it came from a stack overflow or some other problem.
For others who find this problem: there is a compiler flag which puts temporary variables above a certain size on the heap.
