gdb backtrace by walking frame pointers - linux

Sometimes a small stack corruption causes gdb's "backtrace" to fail. I have created the gdb macro below (x86-64, easily adapted to x86). It depends on frame pointers being kept (i.e. compiling with -fno-omit-frame-pointer) and shows me the functions in the backtrace. However, I'd like it to also show parameter values and, ideally, be able to select one of these frames (i.e. something such as "frame 0x0123456789ABCDEF").
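define et
  # Start from the current frame pointer (x86-64: $rbp).
  set $frameptr = $rbp
  while $frameptr != 0
    # The saved return address sits just above the saved frame pointer.
    # Despite its name, $oldbp holds that return address.
    set $oldbp = *((void**)($frameptr+8))
    print $frameptr
    print $oldbp
    info symbol $oldbp
    # Follow the saved frame pointer to the caller's frame.
    set $frameptr = *((void**)($frameptr))
  end
end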

Related

Why do I need to call `clear` after `initscr` with ncurses?

I have solved this problem in the sense that I have code that does what I want, but I don't understand why what I do is necessary, and I cannot find this behaviour documented. Could someone explain why?
I am actually "porting" ncurses to Forth. More correctly I am writing some RISC-V assembly that lightly wraps around the C library calls to give Forth users an interface to ncurses.
This is what the code looks like in Forth:
: hello S" Hello, World!" ;
: goodbye S" Goodbye, World! " ;
: garish 1 color_red color_yellow init_pair ;
: cool 2 color_cyan color_black init_pair ;
: doit initscr clear start_color garish cool 1 color_pair attron 10 10 movew hello drop printw refresh getch 1 color_pair attroff ;
: endit 2 color_pair attron 30 30 movew goodbye drop printw refresh getch 2 color_pair attroff endwin ;
doit
endit
For those who don't know Forth: this defines a few words, then calls the doit and endit words to actually execute the program. But this isn't a question about Forth; it is about why I need to call clear (which is simply a wrapper around the ncurses clear call) after I call initscr (again just a simple wrapper around the ncurses function).
The assembly is shown below. 'CODEEND' and 'CODEHEADER' are macros that generate Forth function headings and 'TAILMOD' generates the return to the loop code - but the key point is you can see these are very light wrappers around the C calls:
CODEEND INITSCR, 0x01
        #(--)
        call initscr
        la t0, STDSCR           #store stdscr
        sd a0, 0(t0)
        TAILMOD t1
CODEHEADER CLEAR, INITSCR, 0x01
        #( -- )
        call clear
        TAILMOD t1
If I don't call clear, then on every subsequent call of the doit endit pair I just see the previous output - i.e. initscr has not cleared the screen without this explicit call to clear.
Am I right in thinking that initscr should normally clear the screen on every invocation (at the next call of refresh)?
Is it just a function of my terminal type (this is a tty over ssh in this case) that I have to call clear or is there something else at work here?
Adding: just to be totally clear, I know the documentation says a call to refresh is required to clear the screen. I have done that, but it doesn't work. That is implicit in the code I've posted, where there is an explicit refresh, but I am adding this to make it obvious to people who maybe don't grok Forth. (Originally there was a refresh where the clear is now, but it did not clear the screen on subsequent calls.)
Per manpage:
initscr also causes the first call to refresh(3x) to clear the screen.
You won't see any change on the screen until that refresh. Some functions (such as getch) do a refresh as a side-effect.

What does the linux page address '0xdead0000~~' mean?

I used kgdb to debug the Linux kernel and ran print *page.
The result shows some addresses that start with '0xdead', for example:
{lru = {next = 0xdead000000000100, prev = 0xdead000000000122},
What do those pages mean? Are they NULL pages, or something meaningful?
Thank you.
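These pointers are poison constants (LIST_POISON1 and LIST_POISON2). list_del() writes them into an entry's next and prev fields when the entry is removed from a list, so that any later use of the stale pointers faults instead of silently touching valid memory. See list_del().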
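A minimal userspace sketch of the idea, not the kernel's actual code: the names loosely mirror the kernel's list API, but the implementation is simplified, and the two poison values are assumptions chosen to match the kgdb output in the question (the real constants are defined in include/linux/poison.h and vary with kernel version and architecture). It also assumes 64-bit pointers.

```c
#include <assert.h>
#include <stdint.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Assumed values, taken from the kgdb output in the question. */
#define POISON_NEXT ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define POISON_PREV ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

/* An empty circular list points at itself in both directions. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* True if the entry carries both poison values, i.e. it was deleted. */
static int list_entry_is_poisoned(const struct list_head *entry)
{
    return entry->next == POISON_NEXT && entry->prev == POISON_PREV;
}

/* Unlink entry, then overwrite its pointers with the poison values. */
static void list_del_poisoned(struct list_head *entry)
{
    assert(!list_entry_is_poisoned(entry)); /* catch a double delete */
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = POISON_NEXT;
    entry->prev = POISON_PREV;
}
```

After list_del_poisoned(&a), printing a in a debugger would show the same 0xdead... pattern as in the question, which is a hint that the entry is currently not on any list.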

How can I find the pages that belong to the heap in a linux process?

I would like to write a simple kernel function that iterates over all the vm_area_structs that belong to a specific process and marks each one of them as belonging to the heap or not. Assume that I can add a boolean field to the vm_area_struct that will be set for heap pages and cleared for other pages.
I have looked into the mm_struct, vm_area_struct, and task_struct... but found nothing that can help.
Update: I am guessing start_brk and brk have something to do with this?
(Am inserting my last comment as an answer, as the formatting within "Comment" is not that great):
Wrt my prev comment: the relevant code (to look up VMAs of a given PID) seems to be here: fs/proc/task_mmu.c .
And, yes indeed, the "[heap]" is marked by this code snippet from the above src file (kernel ver 3.10.24):
*fs/proc/task_mmu.c:show_map_vma()*
...
        if (vma->vm_start <= mm->brk &&
            vma->vm_end >= mm->start_brk) {
                name = "[heap]";
                goto done;
        }
...
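The test in that snippet is an interval-overlap check between the VMA and the range from start_brk to brk. A minimal userspace sketch of the same predicate follows. The struct names demo_mm and demo_vma are hypothetical stand-ins that carry only the fields the check uses; they are not the kernel's real definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for mm_struct and vm_area_struct. */
struct demo_mm  { unsigned long start_brk, brk; };
struct demo_vma { unsigned long vm_start, vm_end; };

/* Same comparison as in show_map_vma(): the VMA overlaps the brk area. */
static bool vma_is_heap(const struct demo_vma *vma, const struct demo_mm *mm)
{
    assert(vma->vm_start <= vma->vm_end); /* sanity check on the input */
    return vma->vm_start <= mm->brk && vma->vm_end >= mm->start_brk;
}
```

Inside the kernel, the same comparison could presumably be used to set the proposed boolean field while walking the process's VMAs. Note that it only identifies the brk-based heap; memory that an allocator obtains through mmap lies outside that range.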

Find the value for the address in GDB (Cent OS 6)

For analysis purposes we want to know which data (message) is stored at an address. Is there any way to find the message in GDB?
In other words, we know the memory address (0x80488b4), but we also want to see the message stored at that address through GDB.
Sample code :
(gdb) print option_value
$1 = (const void *) 0x80488b4
If you know the type typemsg_t of the message, you could dereference it, e.g. print *(typemsg_t*) option_value
You might also be interested in GDB's watchpoint ability.
It is worth taking some time to read the GDB documentation!
What is "the message"? You can of course examine the contents of memory at that address, using gdb's x (for examine) command:
(gdb) x option_value
If you know that option_value, despite looking like a const void * in the current scope, is really of some other type, you can cast and dereference:
(gdb) print *(MessageType *) option_value

fopen crashes only when running from release executable

I make several calls to a function that reads data from an input file. Everything works fine in debug mode, but when I try to run the executable from release mode, the line with fopen crashes the program after a few calls. My code is:
From header file:
#define presstankdatabase "presst_database.txt"
In function:
FILE *fidread;
fidread = fopen(presstankdatabase,"r");
if (fidread==NULL) {
    printf("Failed to open pressurant tank database: %s\n",presstankdatabase);
    return 1;
}
In debugging, I've inserted comment lines just before and just after the line starting with fidread =, and after several calls the program crashes and I get the message "A problem caused the program to stop working correctly. Please close the program." The comment just before the fopen call is displayed, but the comment just after it is not. My understanding of fopen is that it should return either a pointer or NULL, but it crashes before it even gets to the check. The only thing I can think of is that somehow I'm having memory problems, but I don't know how that would fit in with fopen crashing. Does anyone know what might be going on? Thanks!
EDIT 1: I increased the size of three variables, and the only places they're used (except in printf() calls), are as shown below.
char *constid = (char*)malloc(sizeof(char)*20);
Used like so:
strcpy(constid,"Propellant");
strcpy(constid,"Propellant tank");
strcpy(constid,"Pressurant tank");
If the variables are sized to 20, as shown above, it crashes. But if they're larger (I've tried 120 and 100), the program runs. The variables aren't used anywhere other than in fprintf() or printf() calls.
presstankdatabase should be a pointer to a string containing the filename to open. If fopen() crashes then that pointer is probably invalid (or NULL). Without any more code it is not possible to debug it further. Use the VC debugger to see what's happening...
EDIT:
Another common cause of this is a filename string that is no longer null-terminated.
You should add a printf() call to print the filename before opening. It will most probably fail to produce the expected output. If not, then you have a more interesting form of memory corruption that will take some more work to weed out.
EDIT 2:
If the printf() call shows the correct string, then you probably have memory corruption somewhere else in your code that has mangled some internal structure of the C library. A common cause is going beyond the end (or the beginning for that matter) of a static array or a region provided by malloc().
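One way to narrow down this kind of overrun is to route every copy into a fixed-size buffer through a helper that checks the length first. The sketch below is only an illustration of that approach; copy_checked is a hypothetical helper, not part of the C library or of the asker's code.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: copies src into dst only if it fits, including
 * the terminating '\0'. Returns 0 on success, or -1 if src is too long,
 * in which case dst is set to the empty string.
 */
static int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t needed = strlen(src) + 1; /* characters plus terminator */
    if (needed > dst_size) {
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, needed);
    return 0;
}
```

For the strings in the question, "Propellant tank" and "Pressurant tank" each need 16 bytes including the terminator, so they fit in a 20-byte buffer. That suggests the three strcpy calls shown are not the overrun themselves, and that the corruption comes from code not shown; enlarging the buffers may only be moving the damage somewhere harmless.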

Resources