Can I instruct gdb to run commands in response to SIGTRAP? - linux

I'm debugging a reference leak in a GObject-based application. GObject has a simple built-in mechanism to help with such matters: you can set the g_trap_object_ref variable in gobject.c to the object that you care about, and then every ref or unref of that object will hit a breakpoint instruction (via G_BREAKPOINT()).
So sure enough, the program does get stopped, with gdb reporting:
Program received signal SIGTRAP, Trace/breakpoint trap.
g_object_ref (_object=0x65f090) at gobject.c:2606
2606 old_val = g_atomic_int_exchange_and_add ((int *)&object->ref_count, 1);
(gdb) _
which is a great start. Now, normally I'd script some commands to be run at a breakpoint I manually set using commands 3 (for breakpoint 3, say). But the equivalent for SIGTRAP, namely handle SIGTRAP, doesn't give me the option of doing anything particularly interesting. Is there a good way to do this?
(I'm aware that there are other ways to debug reference leaks, such as setting watchpoints on the object's ref_count field, refdbg, scripting regular breakpoints on g_object_ref() and g_object_unref(). I'm about to go try of those now. I'm looking specifically for a way to script a response to SIGTRAP. It might come in useful in other situations, too, and I'd be surprised if gdb doesn't support this.)

Do you want to show some values and continue execution of the program? In that case, just define a macro that displays the values you're interested in, continues execution and calls itself recursively:
define c
echo do stuff\n
continue
c
end

GDB doesn't support it.
In general, attaching a command script to signal makes little sense -- your program could be receiving SIGTRAP in any number of places, and the command will not know whether a particular SIGTRAP came in in expected context or not.

Related

How can I find a reason that a program stalls?

I'm handling a program which controls a car.
The program has a pretty large scale and it was made by other people.
So I don't understand completely how it works.
But I have to apply it and make a car moving.
The problem I'm facing is that the program often stalls with no error, no segmentation.
If it crashes, I can trace the cause with gdb or something like that.
But it does not crash, it silently stops.
How can I find the cause?
From your description - program silently stops - I understand that your program simply and gracefully exited, but not from your expected flow. This can happen for many reasons - for example, maybe your program enters illegal flow and some sub-component, such as standard library or other library decide that program should exit, and thus calls c-runtime exit() or directly calls Kernel32!ExitProcess().The best way to debug this flow is to attach a debugger and set a breakpoint on these two methods and find out who is calling them.If you mean your program enters a deadlock and halts then also you will need to attach a debugger and find out who is stuck.

What mechanism does gdb use to know where to "finish" a function call?

In gdb, when debugging inside a function, we can use "finish" command to run to the end of a function.
My question is: how does gdb know the ending position of a function, especially when there's no debugging symbol to match source code "{}"?
I guess gdb looks for either "leave" or "mov %rbp, %rsp,pop %rbp" under x86 in order to judge whether it has reached the end of a function.
But the problem is,
(1) There're still some extra registers that needs to push/pop at the begin/end of a function call, depending on source code and ABI structure.
(2)The number of registers needs to be push/pop is decided during compilation phase, and I'm afraid this "number" information is not available throw binary executable file.
So, how does gdb determine, where is the end of a function call, so that "finish" command can jump to it?
Thanks!
gdb doesn't try to analyze the machine code. Instead, it unwinds the stack, finds the caller's PC, and sets a temporary breakpoint there. Then it lets the inferior run until the breakpoint is hit.
Due to the way gdb's unwinder is designed, this automatically handles finish from an inlined function as well (though there are still a few special cases in the code due to this).

view output of already running processes in linux

I have a process that is running in the background (sh script) and I wonder if it is possible to view the output of this process without having to interrupt it.
The process ran by some application otherwise I would have attached it to a screen for later viewing. It might take an hour to finish and i want to make sure it's running normally with no errors.
There is already an program that uses ptrace(2) in linux to do this, retty:
http://pasky.or.cz/dev/retty/
It works if your running program is already attached to a tty, I do not know if it will work if you run your program in background.
At least it may give some good hints. :)
You can probably retreive the exit code from the program using ptrace(2), otherwise just attach to the process using gdb -p <pid>, and it will be printed when the program dies.
You can also manipulate file descriptors using gdb:
(gdb) p close(1)
$1 = 0
(gdb) p creat("/tmp/stdout", 0600)
$2 = 1
http://etbe.coker.com.au/2008/02/27/redirecting-output-from-a-running-process/
You could try to hook into the /proc/[pid]/fd/[012] triple, but likely that won't work.
Next idea that pops to my mind is strace -p [pid], but you'll get "prittified" output. The possible solution is to strace yourself by writing a tiny program using ptrace(2) to hook into write(2) and writing the data somewhere. It will work but is not done in just a few seconds, especially if you're not used to C programming.
Unfortunately I can't think of a program that does precisely what you want, which is why I give you a hint of how to write it yourself. Good luck!

Getting a backtrace of other thread

In Linux, to get a backtrace you can use backtrace() library call, but it only returns backtrace of current thread. Is there any way to get a backtrace of some other thread, assuming I know it's TID (or pthread_t) and I can guarantee it sleeps?
It seems that libunwind (http://www.nongnu.org/libunwind/) project can help. The problem is that it is not supported by CentOS, so I prefer not to use it.
Any other ideas?
Thanks.
I implemented that myself here.
Initially, I wanted to implement something similar as suggested here, i.e. getting somehow the top frame pointer of the thread and unwinding it manually (the linked source is derived from Apples backtrace implementation, thus might be Apple-specific, but the idea is generic).
However, to have that safe (and the source above is not and may even be broken anyway), you must suspend the thread while you access its stack. I searched around for different ways to suspend a thread and found this, this and this. Basically, there is no really good way. The common hack, also used by the Hotspot JAVA VM, is to use signals and sending a custom signal to your thread via pthread_kill.
So, as I would need such signal-hack anyway, I can have it a bit simpler and just use backtrace inside the called signal handler which is executed in the target thread (as also suggested here by sandeep). This is basically what my implementation is doing.
If you are also interested in printing the backtrace, i.e. get some useful debugging information (function name, source code filename, source code line number, ...), read here about an extended backtrace_symbols based on libbfd. Or just see the source here.
Signal Handling with the help of backtrace can solve your purpose.
I mean if you have a PID of the Thread, you can raise a signal for that thread. and in the handler you can use the backtrace. since the handler would be executing in that partucular thread, the backtrace there would be the output what you are needed.
gdb provides these facilities for debugging multi-thread programs:
automatic notification of new threads
‘thread thread-id’, a command to switch among threads
‘info threads’, a command to inquire about existing threads
‘thread apply [thread-id-list] [all] args’, a command to apply a command to a list of threads
thread-specific breakpoints
‘set print thread-events’, which controls printing of messages on thread start and exit.
‘set libthread-db-search-path path’, which lets the user specify which libthread_db to use if the default choice isn't compatible with the program.
So just goto required thread in GDB by cmd: 'thread thread-id'.
Then do 'bt' in that thread context to print the thread backtrace.

invoking less application from GNU readline

Bit support question. Apologies for that.
I have an application linked with GNU readline. The application can invoke shell commands (similar to invoking tclsh using readline wrapper). When I try to invoke the Linux less command, I get the following error:
Suspend (tty output)
I'm not an expert around issues of terminals. I've tried to google it but found no answer. Does any one know how to solve this issue?
Thanks.
You probably need to investigate the functions rl_prep_terminal() and rl_deprep_terminal() documented in the readline manual:
Function: void rl_prep_terminal(int meta_flag)
Modify the terminal settings for Readline's use, so readline() can read a single character at a time from the keyboard. The meta_flag argument should be non-zero if Readline should read eight-bit input.
Function: void rl_deprep_terminal(void)
Undo the effects of rl_prep_terminal(), leaving the terminal in the state in which it was before the most recent call to rl_prep_terminal().
The less program is likely to get confused if the terminal is already in the special mode used by the Readline library and it tries to tweak the terminal into an equivalent mode. This is a common problem for programs that work with the curses library, or other similar libraries that adjust the terminal status and run other programs that also do that.
Whilst counterintuitive it may be stopped waiting for input (some OSs and shells give Stopped/Suspended (tty output) when you might expect it to refer to (tty input)). This would fit the usual behaviour of less when it stops at the end of (what it thinks is) the screen length.
Can you use cat or head instead? or feed less some input? or look at the less man/info pages to see what options to less might suit your requirement (e.g w, z, F)?
Your readline application is making itself the controlling app for your tty.
When you invoke less from inside the application, it wants to be in control of the tty as well.
If you are trying to invoke less in your application to display a file for the user,
you want to set the new fork'd process into it's own process group before calling exec.
You can do this with setsid(). Then when less call tcsetpgrpp(), it will not get
thrown into the backgroud with SIGTTOU.
When less finishes, you'll want to restore the foregroud process group with tcsetpgrp(), as well.

Resources