My strace program gives wrong output on 64-bit Linux

I am building an online judge system. The key point is to trace system calls.
I chose ptrace. The child process stops with SIGTRAP when it is about to enter a system call, and the parent process then reads the child's orig_rax register (orig_eax on 32-bit) to get the system call number.
When the code runs on openSUSE 13.1 32-bit there is no problem: its output is the same as that of the Linux strace command.
But when I test the code on openSUSE 13.1 64-bit or Ubuntu 12.04 64-bit, its output looks wrong.
The demo code can be downloaded here : https://gist.github.com/kainwen/41a7bd0198099d766bda
On a 64-bit Linux system, save the code as strace_ls.c, compile it with gcc strace_ls.c, and run it with ./a.out 2>out.
The output of my code looks strange to me.
My code and its output are below:
#include <sys/resource.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/reg.h>
#include <sys/user.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <signal.h>
#include <stdio.h>
void judge_exe()
{
pid_t pid ;
int insyscall = 0 ;
struct user context ;
pid = fork() ;
if (pid == 0) { //child
ptrace(PTRACE_TRACEME,0,NULL,NULL) ;
execl("/usr/bin/ls","ls",NULL) ;
}
else {//parent
int status ;
ptrace(PTRACE_SYSCALL, pid, NULL, NULL);
while (1) {
wait(&status) ;
if (WIFEXITED(status)) //normally terminated
break;
else if (WIFSTOPPED(status) && (WSTOPSIG(status)==SIGTRAP)) {
if (!insyscall) {
insyscall = 1;
ptrace(PTRACE_GETREGS,pid,NULL,&context.regs);
fprintf(stderr,"syscall num: %llu \n",context.regs.orig_rax); /* orig_rax is unsigned long long on x86_64 */
} else
insyscall = 0;
}
ptrace(PTRACE_SYSCALL,pid,NULL,NULL);
}
return;
}
}
int main(int argc, char *argv[]) {
judge_exe();
}
The first lines of the output of the code above are:
syscall num: 59
syscall num: 12
syscall num: 9
syscall num: 21
syscall num: 2
syscall num: 4
The first lines of the output of the Linux command strace ls are:
execve("/usr/bin/ls", ["ls"], [/* 101 vars */]) = 0
brk(0) = 0x1bab000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efff4326000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/usr/lib64/mpi/gcc/openmpi/lib64/tls/x86_64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/mpi/gcc/openmpi/lib64/tls/x86_64", 0x7fff00c49b80) = -1 ENOENT (No such file or directory)
I expected the system call number of sys_execve to be 11, not 59 (which is what my code prints).
Solved
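My method is right. System call numbers differ between 32-bit and 64-bit x86: execve is 11 on i386 but 59 on x86_64. The numbers printed above (59, 12, 9, 21, 2, 4) are the x86_64 numbers for execve, brk, mmap, access, open and stat, so the output matches the strace output line by line.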

Related

How to tell if O_DIRECT is in use?

I'm running an IO intensive process that supports O_DIRECT. Is there a way to tell if O_DIRECT is being used while the process is running?
I tried "iostat -x 1" but I'm not sure which field would help me.
Thanks.
You will have to get the pid of the running process. Once you have the pid, you can do
cat /proc/[pid]/fdinfo/<fd number>
You will also have to know the fd number of the open file.
The output contains a flags field. It is an octal value showing the flags the file descriptor was opened with. You will have to examine it to see whether O_DIRECT is set.
As an example, on my Ubuntu machine (x86_64), I created two files, foo1 and foo2,
touch foo1 foo2
and then opened foo1 with O_DIRECT and foo2 without O_DIRECT. Below is the program
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
int main()
{
printf("%d\n", (int)getpid()); /* pid_t is a signed type */
int fd1 = open("foo1", O_RDWR|O_DIRECT); //O_DIRECT set
printf("foo1: %d\n", fd1);
int fd2 = open("foo2", O_RDWR); //Normal
printf("foo2: %d\n", fd2);
sleep(60);
close(fd1);
close(fd2);
return 0;
}
On running this I got the output:
8885
foo1: 3 //O_DIRECT
foo2: 4
8885 is the pid. So I did
cat /proc/8885/fdinfo/3 //O_DIRECT
pos: 0
flags: 0140002
mnt_id: 29
-------------------------------
cat /proc/8885/fdinfo/4
pos: 0
flags: 0100002
mnt_id: 29
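From the above output you can see that for the descriptor opened with O_DIRECT, the bit 0040000 is also set in the flags field. That is the value of O_DIRECT on x86 and x86_64; other architectures use different values, so compare against the O_DIRECT constant from <fcntl.h> rather than a hard-coded number.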

second shm_open() fails with ENOENT

mode_t mode = S_IRWXU | S_IRWXG | S_IRWXO;
shm_fd = shm_open("/ipc_shm", O_CREAT | O_RDWR, mode);
This works, returns 4 for shm_fd. The same process then calls a library function that also calls
fd = shm_open("/ipc_shm", O_RDWR, 0);
This one fails with errno set to 2, i.e. ENOENT (No such file or directory). There is no shm_unlink call in between. Any idea why the second call is failing? I appreciate your help.
My test.c:
#include <sys/mman.h>
#include <sys/stat.h> /* For mode constants */
#include <fcntl.h> /* For O_* constants */
int main (int argc, char *argv[])
{
mode_t mode = S_IRWXU | S_IRWXG | S_IRWXO;
int shm_fd = shm_open("/ipc_shm", O_CREAT | O_RDWR, mode);
int fd = shm_open("/ipc_shm", O_RDWR, 0);
return 0;
}
Compiled with gcc test.c -Wall -lrt, it works as expected:
$strace ./a.out
....
statfs("/dev/shm/", {f_type=0x1021994, f_bsize=4096, f_blocks=22290, f_bfree=22290, f_bavail=22290, f_files=55725, f_ffree=55723, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
futex(0xb6f5d1c0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
open("/dev/shm/ipc_shm", O_RDWR|O_CREAT|O_NOFOLLOW|O_CLOEXEC, 0777) = 3
fcntl64(3, F_GETFD) = 0x1 (flags FD_CLOEXEC)
open("/dev/shm/ipc_shm", O_RDWR|O_NOFOLLOW|O_CLOEXEC) = 4
exit_group(0)
Please run strace on your application and search for all occurrences of ipc, and also for chroot(). Maybe something unlinks the file?

mknod() not creating named pipe

I'm trying to create a FIFO named pipe using the mknod() command:
int main() {
char* file="pipe.txt";
int state;
state = mknod(file, S_IFIFO & 0777, 0);
printf("%d",state);
return 0;
}
But the file is not created in my current directory. I tried listing it by ls -l . State returns -1.
I found similar questions here and on other sites and I've tried the solution that most suggested:
int main() {
char* file="pipe.txt";
int state;
unlink(file);
state = mknod(file, S_IFIFO & 0777, 0);
printf("%d",state);
return 0;
}
This made no difference though and the error remains. Am I doing something wrong here or is there some sort of system intervention which is causing this problem?
Help.. Thanks in advance
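You are using & to combine the file type with the permission bits instead of |. S_IFIFO (0010000) shares no bits with 0777, so S_IFIFO & 0777 evaluates to 0 and the FIFO type never reaches mknod(). From the docs: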
The file type for path is OR'ed into the mode argument, and the
application shall select one of the following symbolic
constants...
Try this:
state = mknod(file, S_IFIFO | 0777, 0);
Because this works:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main() {
char* file="pipe.txt";
int state;
unlink(file);
state = mknod(file, S_IFIFO | 0777, 0);
printf("state %d\n", state);
return 0;
}
Compile it:
gcc -o fifo fifo.c
Run it:
$ strace -e trace=mknod ./fifo
mknod("pipe.txt", S_IFIFO|0777) = 0
state 0
+++ exited with 0 +++
See the result:
$ ls -l pipe.txt
prwxrwxr-x. 1 lars lars 0 Jul 16 12:54 pipe.txt

Detach a linux process from pseudo-tty, but keep the tty running?

I want to debug a console linux application with 2 xterm windows: one window used for gdb and another used for the application (e.g. mc).
What I do now is run 'tty && sleep 1024d' in the second xterm window (this gives me its pseudo-tty name) and then run 'tty ' in gdb to redirect the program to that other xterm window. However, GDB warns that it cannot set a controlling terminal and certain minor functions don't work (e.g. handling window resizing), as 'sleep 1024d' is still running on that xterm window.
Any better way to do it (rather than launching the process from the shell and attaching to it from gdb)?
I have slightly modified the program given in a related bug report so that it stores the tty name in a file:
http://sourceware.org/bugzilla/show_bug.cgi?id=11403
Here is an example using it:
$ xterm -e './disowntty ~/tty.tmp' & sleep 1 && gdb --tty $(cat ~/tty.tmp) /usr/bin/links
/* tty;exec disowntty */
#include <sys/ioctl.h>
#include <unistd.h>
#include <stdio.h>
#include <limits.h>
#include <stdlib.h>
#include <signal.h>
static void
end (const char *msg)
{
perror (msg);
for (;;)
pause ();
}
int
main (int argc, const char *argv[])
{
FILE *tty_name_file;
const char *tty_filename;
if (argc <= 1)
return 1;
else
tty_filename = argv[1];
void (*orig) (int signo);
setbuf (stdout, NULL);
orig = signal (SIGHUP, SIG_IGN);
if (orig != SIG_DFL)
end ("signal (SIGHUP)");
/* Verify we are the sole owner of the tty. */
if (ioctl (STDIN_FILENO, TIOCSCTTY, 0) != 0)
end ("TIOCSCTTY");
printf("%s %s\n", tty_filename, ttyname(STDIN_FILENO));
tty_name_file = fopen(tty_filename, "w");
if (tty_name_file == NULL)
end ("fopen");
fprintf(tty_name_file, "%s\n", ttyname(STDIN_FILENO));
fclose(tty_name_file);
/* Disown the tty. */
if (ioctl (STDIN_FILENO, TIOCNOTTY) != 0)
end ("TIOCNOTTY");
end ("OK, disowned");
return 1;
}

segfault on write() with ~8MB buffer (OSX, Linux)

I was curious what kind of buffer sizes write() and read() could handle on Linux/OSX/FreeBSD, so I started playing around with dumb programs like the following:
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
int main( void ) {
size_t s = 8*1024*1024 - 16*1024;
while( 1 ) {
s += 1024;
int f = open( "test.txt", O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR | S_IXUSR );
char mem[s];
size_t written = write( f, &mem[0], s );
close( f );
printf( "(%ld) %lu\n", sizeof(size_t), written );
}
return 0;
}
This allowed me to test how close to a seeming "8MB barrier" I could get before segfaulting. Somewhere around the 8MB mark, my program dies, here's an example output:
(8) 8373248
(8) 8374272
(8) 8375296
(8) 8376320
(8) 8377344
(8) 8378368
(8) 8379392
(8) 8380416
(8) 8381440
(8) 8382464
Segmentation fault: 11
This is the same on OSX and Linux, however my FreeBSD VM is not only much faster at running this test, it also can go on for quite a ways! I've successfully tested it up to 511MB, which is just a ridiculous amount of data to write in one call.
What is it that makes the write() call segfault, and how can I figure out the maximum amount that I can possibly write() in a single call, without doing something ridiculous like I'm doing right now?
(Note, all three operating systems are 64-bit, OSX 10.7.3, Ubuntu 11.10, FreeBSD 9.0)
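The fault isn't within write(), it's a stack overflow. char mem[s] is a variable-length array allocated on the stack, and the default stack size limit is typically 8 MB on Linux and OS X, so the program crashes as soon as the array no longer fits. Allocate the buffer on the heap instead. Try this: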
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
int main( void )
{
void *mem;
size_t s = 512*1024*1024 - 16*1024;
while( 1 )
{
s += 1024;
int f = open( "test.txt", O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR | S_IXUSR );
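mem = malloc(s); /* heap allocation, not bounded by the stack limit */
if( mem == NULL ) { close( f ); break; } /* stop when the allocation fails */
ssize_t written = write( f, mem, s ); /* write() returns ssize_t, -1 on error */
free(mem);
close( f );
printf( "(%zu) %zd\n", sizeof(size_t), written );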
}
return 0;
}
