gdb stops in a command file if there is an error. How to continue despite the error? - linux

I my real gdb script while analyzing a core file I try to dereference a pointer and get "Error in sourced command file: Cannot access memory at address " and then my gdb script stops. What I want is just to go on executing my gdb script without stopping. Is it possible?
This is a test program and a test gdb script that demonstrates my problem. In this situation the pointer has NULL value but in a real situation the pointer will like have not null invalid value.
This is test C program:
#include <stdio.h>
struct my_struct {
int v1;
int v2;
};
int main()
{
my_struct *p;
printf("%d %d\n", p->v1, p->v2);
return 0;
}
This is a test gdb script:
>cat analyze.gdb
p p->v1
q
And this is demonstration of the problem (what I want from gdb here is to get this error message and then go process quit command):
>gdb -silent a.out ./core.22384 -x ./analyze.gdb
Reading symbols from /a.out...done.
[New Thread 22384]
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000400598 in main () at main.cpp:11
11 printf("%d %d\n", p->v1, p->v2);
./analyze.gdb:1: Error in sourced command file:
Cannot access memory at address 0x0
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6.x86_64
Update
Thanks to Tom. This is a gdb script that handles this problem:
>cat ./analyze.v2.gdb
python
def my_ignore_errors(arg):
try:
gdb.execute("print \"" + "Executing command: " + arg + "\"")
gdb.execute (arg)
except:
gdb.execute("print \"" + "ERROR: " + arg + "\"")
pass
my_ignore_errors("p p")
my_ignore_errors("p p->v1")
gdb.execute("quit")
This is how it works:
>gdb -silent ./a.out -x ./analyze.v2.gdb -c ./core.15045
Reading symbols from /import/home/a.out...done.
[New Thread 15045]
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000400598 in main () at main.cpp:11
11 printf("%d %d\n", p->v1, p->v2);
$1 = "Executing command: p p"
$2 = (my_struct *) 0x0
$3 = "Executing command: p p->v1"
$4 = "ERROR: p p->v1"
$5 = "Executing command: quit"

gdb's command language doesn't have a way to ignore an error when processing a command.
This is easily done, though, if your gdb was built with the Python extension. Search for the "ignore-errors" script. With that, you can:
(gdb) ignore-errors print *foo
... and any errors from print will be shown but not abort the rest of your script.

You can also do this:
gdb a.out < analyze.v2.gdb
This will execute the commands in analyze.v2.gdb line by line, even if an error occurs.

If you just want to exit if any error occurs, you can use the -batch gdb option:
Run in batch mode. Exit with status 0 after processing all the command
files specified with ā€˜-xā€™ (and all commands from initialization files,
if not inhibited with ā€˜-nā€™). Exit with nonzero status if an error
occurs in executing the GDB commands in the command files. [...]

Related

Why does system call fails?

I try to compile simple utility (kernel 3.4.67), which
all it does is trying to use system call very simply as following:
int main(void)
{
int rc;
printf("hello 1\n");
rc = system("echo hello again");
printf("system returned %d\n",rc);
rc = system("ls -l");
printf("system returned %d\n",rc);
return 0;
}
Yet, system call fails as you can see in the following log:
root#w812a_kk:/ # /sdcard/test
hello 1
system returned 32512
system returned 32512
I compile it as following:
arm-linux-gnueabihf-gcc -s -static -Wall -Wstrict-prototypes test.c -o test
That's really wierd becuase I used system in past in different linux and never had any issue with it.
I even tried another cross cpompiler but I get the same failure.
Version of kernel & cross compiler:
# busybox uname -a
Linux localhost 3.4.67 #1 SMP PREEMPT Wed Sep 28 18:18:33 CST 2016 armv7l GNU/Linux
arm-linux-gnueabihf-gcc --version
arm-linux-gnueabihf-gcc (crosstool-NG linaro-1.13.1-4.7-2013.03-20130313 - Linaro GCC 2013.03) 4.7.3 20130226 (prerelease)
EDIT:
root#w812a_kk:/ # echo hello again && echo $? && echo $0
hello again
0
tmp-mksh
root#w812a_kk:/ #
But I did find something interesting:
On calling test_expander() withing the main, it works OK. so I suspect that maybe system call try to find a binary which is not founded ?
int test_expander(void)
{
pid_t pid;
char *const parmList[] = {"/system/bin/busybox", "echo", "hello", NULL};
if ((pid = fork()) == -1)
perror("fork error");
else if (pid == 0) {
execv("/system/bin/busybox", parmList);
printf("Return not expected. Must be an execv error.n");
}
return 0;
}
Thank you for any idea.
Ran
The return value of system(), 32512 decimal, is 7F00 in hex.
This value is strangely similar to 0x7F, which is the result of system() if /bin/sh can not be executed. It seems there is some problem with byte ordering (big/little endian). Very strange.
Update: while writing the answer, you edited the question and pulled in something about /system/bin/busybox.
Probably you simply don't have /bin/sh.
I think I understand what happens
From system man page:
The system() library function uses fork(2) to create a child process
that executes the shell command specified in command using execl(3)
as follows:
execl("/bin/sh", "sh", "-c", command, (char *) 0);
But in my filesystem sh is founded only /system/bin , not in /bin
So I better just use execv instead. (I can't do static link becuase it's read-only filesystem)
Thanks,
Ran

How can I continue sending to stdin after input from bash process substitution finishes?

I'm using gdb.
I run a command like the below to set up the program by sending it input to stdin:
r < <(python -c "print '1\n2\n3'")
I want that command to allow me to start typing input after it finishes (so I can interact with the debugee normally) instead of stdin being closed.
This would work in bash but you can't pipe to the gdb r command this way:
cat <(python -c "print '1\n2\n3'") - | r
The below doesn't work, I assume it waits for EOF before it sends it to the program.
r < <(cat <(python -c "print '1\n2\n3'") -)
Is there a third option that will work?
This sounds like a job for expect.
Given
#include <stdio.h>
int main()
{
char *cp = NULL;
size_t n = 0;
while(getline(&cp, &n, stdin) >= 0) {
fprintf(stderr, "got: %s", cp);
}
return 0;
}
gcc -g -Wall t.c
And this expect script:
#!/usr/bin/expect
spawn gdb -q ./a.out
send run\n
send 1\n2\n3\n
interact
Here is the session:
$ ./t.exp
spawn gdb -q ./a.out
run
1
2
3
Reading symbols from ./a.out...done.
(gdb) run
Starting program: /tmp/a.out
got: 1
got: 2
got: 3
Now the script is waiting for my input. I provide some:
foo bar baz
got: foo bar baz
I can also interact with GDB:
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7b006b0 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0 0x00007ffff7b006b0 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007ffff7a8f5a0 in _IO_new_file_underflow (fp=0x7ffff7dd4640 <_IO_2_1_stdin_>) at fileops.c:613
#2 0x00007ffff7a840d5 in _IO_getdelim (lineptr=0x7fffffffdda0, n=0x7fffffffdda8, delimiter=10, fp=0x7ffff7dd4640 <_IO_2_1_stdin_>) at iogetdelim.c:77
#3 0x000000000040064e in main () at t.c:9

Interactively toggle output on and off in Linux?

For controlling output on Linux there is control-s and control-t, which provides a method for temporarily halting terminal output and then resuming it. On VMS in addition there was control-O, which would toggle all output on and off. This didn't pause output, it discarded it.
Is there an equivalent keyboard shortcut in Linux?
This comes up most often for me in gdb, when debugging programs which output millions of status lines. It would be very convenient to be able to temporarily send most of that to /dev/null rather than the screen, and then pick up with the output stream further on, having dispensed with a couple of million lines in between.
(Edited: The termios(3) man page mentions VDISCARD - and then says that it isn't going to work in POSIX or Linux. So it looks like this is out of the question for general command line use on linux. gdb might still be able to discard output though, through one of its own commands. Can it?)
Thanks.
On VMS in addition there was control-O ...
This functionality doesn't appear to exist on any UNIX system I've ever dealt with (or maybe I just never knew it existed; it's documented in e.g. FreeBSD man page, and is referenced by Solaris and HP-UX docs as well).
gdb might still be able to discard output though, through one of its own commands. Can it?
I don't believe so: GDB doesn't actually intercept the output from the inferior (being debugged) process, it simply makes it run (between breakpoints) with the inferior output going to wherever it's going.
That said, you could do it yourself:
#include <stdio.h>
int main()
{
int i;
for (i = 0; i < 1000; ++i) {
printf("%d\n", i);
}
}
gcc -g foo.c
gdb -q ./a.out
(gdb) break 6
Breakpoint 1 at 0x40053e: file foo.c, line 6.
(gdb) run 20>/dev/null # run the program, file descriptor 20 goes to /dev/null
Starting program: /tmp/a.out 20>/dev/null
Breakpoint 1, main () at foo.c:6
6 printf("%d\n", i);
(gdb) c
Continuing.
0
Breakpoint 1, main () at foo.c:6
6 printf("%d\n", i);
We've now run two iterations. Let's prevent further output for 100 iterations:
(gdb) call dup2(20, 1)
$1 = 1
(gdb) ign 1 100
Will ignore next 100 crossings of breakpoint 1.
(gdb) c
Continuing.
Breakpoint 1, main () at foo.c:6
6 printf("%d\n", i);
(gdb) p i
$2 = 102
No output, as desired. Now let's restore output:
(gdb) call dup2(2, 1)
$3 = 1
(gdb) ign 1 10
Will ignore next 10 crossings of breakpoint 1.
(gdb) c
Continuing.
102
103
104
105
106
107
108
109
110
111
112
Breakpoint 1, main () at foo.c:6
6 printf("%d\n", i);
Output restored!

Suppressing the segfault signal

I am analyzing a set of buggy programs that under some test they may terminate with segfault. The segfault event is logged in /var/log/syslog.
For example the following snippet returns Segmentation fault and it is logged.
#!/bin/bash
./test
My question is how to suppress the segfault such that it does NOT appear in the system log. I tried trap to capture the signal in the following script:
#!/bin/bash
set -bm
trap "echo 'something happened'" {1..64}
./test
It returns:
Segmentation fault
something happened
So, it does traps the segfault but the segfault is still logged.
kernel: [81615.373989] test[319]: segfault at 0 ip 00007f6b9436d614
sp 00007ffe33fb77f8 error 6 in libc-2.19.so[7f6b942e1000+1bb000]
You can try to change ./test to the following line:
. ./test
This will execute ./test in the same shell.
We can suppress the log message system-wide with e. g.
echo 0 >/proc/sys/debug/exception-trace
- see also
Making the Linux kernel shut up about segfaulting user programs
Is there a way to temporarily disable segfault messages in dmesg?
We can suppress the log message for a single process if we run it under ptrace() control, as in a debugger. This program does that:
exe.c
#include <sys/wait.h>
#include <sys/ptrace.h>
main(int argc, char *args[])
{
pid_t pid;
if (*++args)
if (pid = fork())
{
int status;
while (wait(&status) > 0)
{
if (!WIFSTOPPED(status))
return WIFSIGNALED(status) ? 128+WTERMSIG(status)
: WEXITSTATUS(status);
int signal = WSTOPSIG(status);
if (signal == SIGTRAP) signal = 0;
ptrace(PTRACE_CONT, pid, 0, signal);
}
perror("wait");
}
else
{
ptrace(PTRACE_TRACEME, 0, 0, 0);
execvp(*args, args);
perror(*args);
}
return 1;
}
It is called with the buggy program as its argument, in your case
exe ./test
- then the exit status of exe normally is the exit status of test, but if test was terminated by signal n (11 for Segmentation fault), it is 128+n.
After I wrote this, I realized that we can also use strace for the purpose, e. g.
strace -enone ./test

Resume a stopped process through command line

I've executed the following C code in Linux CentOS to create a process.
#include <stdio.h>
#include <unistd.h>
int main ()
{
int i = 0;
while ( 1 )
{
printf ( "\nhello %d\n", i ++ );
sleep ( 2 );
}
}
I've compiled it to hello_count. When I do ./hello_count, The output is like this:
hello 0
hello 1
hello 2
...
till I kill it.
I've stopped the execution using the following command
kill -s SIGSTOP 2956
When I do
ps -e
the process 2956 ./hello_count is still listed.
Is there any command or any method to resume (not to restart) the process having process id 2956?
Also, when I stop the process, the command line shows:
[1]+ Stopped ./hello_count
What does the [1]+ in the above line mean?
To continue a stopped process, that is resume use kill -SIGCONT PID
Regd [1]+ that is bash way of handling jobs. For further information try help jobs from bash prompt.

Resources