Process killed by SIGHUP after read returns ERESTARTSYS - linux

We have an application that calls a PHP script, which connects to an Oracle DB to do certain things. :) Sometimes this does not work.
We are now running the PHP part via strace from the beginning.
This is how it looks when everything works (the DB connection is established, the query is executed, the DB is disconnected again, etc.):
10:30:17.935486 connect(8, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.1.1.55")}, 16) = -1 EINPROGRESS (Operation now in progress)
10:30:17.935546 times(NULL) = 2908590046
10:30:17.935569 brk(0xda4000) = 0xda4000
10:30:17.935594 poll([{fd=8, events=POLLOUT}], 1, 60000) = 1 ([{fd=8, revents=POLLOUT}])
10:30:17.940338 getsockopt(8, SOL_SOCKET, SO_ERROR, [519270883345301504], [4]) = 0
10:30:17.940368 fcntl(8, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
10:30:17.940388 fcntl(8, F_SETFL, O_RDWR) = 0
10:30:17.940408 getsockname(8, {sa_family=AF_INET, sin_port=htons(62498), sin_addr=inet_addr("192.168.22.30")}, [16]) = 0
10:30:17.940437 getsockopt(8, SOL_SOCKET, SO_SNDBUF, [-4193870156763480064], [4]) = 0
10:30:17.940458 getsockopt(8, SOL_SOCKET, SO_RCVBUF, [-4193870156763409068], [4]) = 0
10:30:17.940483 setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
10:30:17.940506 fcntl(8, F_SETFD, FD_CLOEXEC) = 0
10:30:17.940652 rt_sigaction(SIGPIPE, {0x1, ~[ILL ABRT BUS FPE SEGV USR2 TERM XCPU XFSZ SYS RTMIN RT_1], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7f7198b2b920}, {0x1, [PIPE], SA_RESTORER|SA_RESTART, 0x7f7198b2b920}, 8) = 0
10:30:17.940725 write(8, "\x00\xe8\x00\x00\x01\x00\x00\x00\x01\x3b\x01\x2c\x0c\x41\x20\x00\xff\xff\x7f\x08\x00\x00\x01\x00\x00\xa2\x00\x46\x00\x00\x08\x00"..., 232) = 232
10:30:17.940781 read(8, "\x00\x08\x00\x00\x0b\x00\x00\x00", 8208) = 8
10:30:17.974177 write(8, "\x00\xe8\x00\x00\x01\x00\x00\x00\x01\x3b\x01\x2c\x0c\x41\x20\x00\xff\xff\x7f\x08\x00\x00\x01\x00\x00\xa2\x00\x46\x00\x00\x08\x00"..., 232) = 232
10:30:17.974247 read(8, "\x00\x29\x00\x00\x02\x00\x00\x00\x01\x3b\x0c\x41\x00\x00\x00\x00\x01\x00\x00\x00\x00\x29\x51\x41\x00\x00\x00\x00\x00\x00\x00\x00"..., 8208) = 41
10:30:17.976465 write(8, "\x00\x00\x00\xa4\x06\x20\x00\x00\x00\x00\xde\xad\xbe\xef\x00\x9a\x00\x00\x00\x00\x00\x04\x00\x00\x04\x00\x03\x00\x00\x00\x00\x00"..., 164) = 164
....
This is how it looks when it does not work:
10:23:24.888170 connect(8, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.1.1.55")}, 16) = -1 EINPROGRESS (Operation now in progress)
10:23:24.888241 times(NULL) = 2908548738
10:23:24.888263 brk(0xda4000) = 0xda4000
10:23:24.888287 poll([{fd=8, events=POLLOUT}], 1, 60000) = 1 ([{fd=8, revents=POLLOUT}])
10:23:24.889769 getsockopt(8, SOL_SOCKET, SO_ERROR, [519270883345301504], [4]) = 0
10:23:24.889807 fcntl(8, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
10:23:24.889827 fcntl(8, F_SETFL, O_RDWR) = 0
10:23:24.889845 getsockname(8, {sa_family=AF_INET, sin_port=htons(62473), sin_addr=inet_addr("192.168.22.30")}, [16]) = 0
10:23:24.889873 getsockopt(8, SOL_SOCKET, SO_SNDBUF, [-8374476973480591360], [4]) = 0
10:23:24.889892 getsockopt(8, SOL_SOCKET, SO_RCVBUF, [-8374476973480520364], [4]) = 0
10:23:24.889915 setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
10:23:24.889936 fcntl(8, F_SETFD, FD_CLOEXEC) = 0
10:23:24.890062 rt_sigaction(SIGPIPE, {0x1, ~[ILL ABRT BUS FPE SEGV USR2 TERM XCPU XFSZ SYS RTMIN RT_1], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7f2ee24b4920}, {0x1, [PIPE], SA_RESTORER|SA_RESTART, 0x7f2ee24b4920}, 8) = 0
10:23:24.890129 write(8, "\x00\xe8\x00\x00\x01\x00\x00\x00\x01\x3b\x01\x2c\x0c\x41\x20\x00\xff\xff\x7f\x08\x00\x00\x01\x00\x00\xa2\x00\x46\x00\x00\x08\x00"..., 232) = 232
10:23:24.890186 read(8, 0xd705a6, 8208) = ? ERESTARTSYS (To be restarted)
10:23:24.907853 --- SIGHUP (Hangup) @ 0 (0) ---
10:23:24.908708 +++ killed by SIGHUP +++
This happens sometimes and the application (or at least the PHP script and the connection to the DB) just gets killed. That's bad.
What do you make of the above straces?
Can we tell who is killed by who?
Why would read() return ERESTARTSYS?
What does SIGHUP (Hangup) @ 0 (0) tell us exactly?

Your process got sent a SIGHUP, which caused the normal action of exiting.
Can't tell who did it. Try a newer version of strace. From what I can tell, versions going all the way back to 4.6 (from 2011) should display more information. The version of strace you are using predates 2011, and the @ 0 (0) it prints is the PC of the process when the signal was received and the address associated with the signal from siginfo_t. Neither will tell you anything about this problem.
A newer version will supply something like this:
--- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=25064, si_uid=1000} ---
--- SIGHUP {si_signo=SIGHUP, si_code=SI_KERNEL} ---
The first is another process sending the SIGHUP. The second is one sent automatically by the kernel because of certain events.
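To make the SI_USER case concrete, here is a minimal Python sketch (not part of the original setup) that sends itself a SIGHUP with kill() and reads the resulting siginfo, which is the same data a newer strace prints. The helper name is made up for illustration, and the value 0 for SI_USER is an assumption taken from the Linux headers, since Python's signal module does not export it. It also assumes a single-threaded process.

```python
import os
import signal

SI_USER = 0  # assumption: Linux value of SI_USER; not exported by Python's signal module


def hup_sender_info():
    """Send SIGHUP to ourselves and return (si_code, si_pid, si_uid)."""
    # Block SIGHUP first so it stays pending instead of killing the process.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGHUP})
    try:
        os.kill(os.getpid(), signal.SIGHUP)
        info = signal.sigwaitinfo({signal.SIGHUP})  # consumes the pending signal
        return info.si_code, info.si_pid, info.si_uid
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    code, pid, uid = hup_sender_info()
    origin = "sent by a process via kill()" if code == SI_USER else "kernel or other source"
    print(f"SIGHUP si_code={code} si_pid={pid} si_uid={uid}: {origin}")
```

When si_code is SI_USER, si_pid names the sending process, which is exactly the piece of information missing from the old strace output.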
The latter can happen when the controlling terminal of the process closes or when the session leader exits because its terminal closed. If you determine it's the kernel sending the signal, then I'd look at your process while it's running and examine the "sid" and "tty" columns in the ps output. That will tell you the session leader and terminal responsible for causing the SIGHUP to be sent. Maybe sometimes your script has a controlling terminal and sometimes not?
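The "sid" and "tty" values that ps shows can also be read straight from /proc/PID/stat, which is handy if you want to log them from a wrapper. A minimal sketch, assuming Linux with /proc mounted; the function name is hypothetical.

```python
import os


def session_and_tty(pid):
    """Return (session id, tty_nr) for pid, as listed in /proc/PID/stat."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split only after the last ')'.
    fields = stat[stat.rindex(")") + 2:].split()
    # fields after the command name: state, ppid, pgrp, session, tty_nr, ...
    return int(fields[3]), int(fields[4])


if __name__ == "__main__":
    sid, tty_nr = session_and_tty(os.getpid())
    print(f"sid={sid} tty_nr={tty_nr} (0 means no controlling terminal)")
```

Comparing these two values between a run that survives and a run that gets killed should show whether the script sometimes has a controlling terminal.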
The session leader would usually be the parent process that started your script, or the parent of that process, or the parent of that, etc. Looking at the ps output and "sid" will tell you. If that leader process exits and has a controlling terminal, everything under it gets a SIGHUP. The way to solve this would be either to have the leader not exit until the PHP process is finished, or to detach from that session or terminal at some point. Usually a daemon or server process should not be associated with a terminal. See daemon() and setsid().
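As an illustration of detaching from the session, the sketch below starts a child in a new session, so a hangup on the caller's terminal is no longer delivered to it. It is written in Python rather than in whatever language your application uses; the helper names are hypothetical and the one-line child program is only a stand-in for the real script.

```python
import os
import subprocess
import sys


def run_in_new_session(argv):
    """Start argv in its own session; the child calls setsid() before exec."""
    return subprocess.Popen(
        argv,
        start_new_session=True,      # setsid() in the child
        stdin=subprocess.DEVNULL,    # do not keep the caller's terminal as stdin
        stdout=subprocess.PIPE,
    )


def session_ids():
    """Return (caller's sid, child's pid, child's sid) for a detached child."""
    child_code = "import os; print(os.getsid(0))"  # stand-in for the real script
    proc = run_in_new_session([sys.executable, "-c", child_code])
    out, _ = proc.communicate()
    return os.getsid(0), proc.pid, int(out)


if __name__ == "__main__":
    parent_sid, child_pid, child_sid = session_ids()
    print(f"caller sid={parent_sid}, child pid={child_pid}, child sid={child_sid}")
```

The child reports its own pid as its session id, meaning it is the leader of a fresh session with no controlling terminal.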

Related

Reading from a Closed File Descriptor

I traced open, read, close and dup system calls in gimp-2.8.22 using strace, with the following command:
strace -eread,openat,open,close,dup,dup2 gimp
In gimp, I opened an image named babr.jpg. The trace shows that this image was opened (file descriptor 14), read and closed. But immediately after that, the same file descriptor is used for reading, even though 14 is not opened again after the last close. How is this possible?
Here is the relevant portion of trace:
read(14, "\371\331\25\233M\311j\261b\271\332\240\33\315d\234\340y\236\217\323\206(\214\270x2\303S\212\252\254"..., 4096) = 4096
read(14, "t\260\265fv<\243.5A\324\17\221+\36\207\265&+rL\247\343\366\372\236\353\353'\226\27\27"..., 4096) = 318
close(14) = 0
openat(AT_FDCWD, "/home/ahmad/Pictures/babr.jpg", O_RDONLY) = 14
read(14, "\377\330\377\340\0\20JFIF\0\1\1\1\1,\1,\0\0\377\355(\212Photosho"..., 4096) = 4096
close(14) = 0
openat(AT_FDCWD, "/opt/gimp-2.8.22/lib/gimp/2.0/plug-ins/file-jpeg", O_RDONLY) = 19
read(19, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P[\0\0\0\0\0\0"..., 4096) = 4096
close(19) = 0
close(20) = 0
read(19, "", 8) = 0
close(19) = 0
close(17) = 0
close(16) = 0
read(4, "\2\0\0\0\0\0\0\0", 16) = 8
Gtk-Message: 15:09:02.956: Failed to load module "canberra-gtk-module"
read(14, "\0\0\0\5", 4) = 4
read(14, "\0\0\0\23", 4) = 4
read(14, "gimp-progress-init\0", 19) = 19
read(14, "\0\0\0\2", 4) = 4
I also checked this using Pin and found the same result.
The second file descriptor 14 is very likely a pipe between the plug-in and GIMP: once the file was closed, the descriptor number was free and got reused. Your strace filter does not include pipe creation, so you don't see where it comes from.
From gimpplugin.c:
/* Open two pipes. (Bidirectional communication).
 */
if ((pipe (my_read) == -1) || (pipe (my_write) == -1))
  {
    gimp_message (plug_in->manager->gimp, NULL, GIMP_MESSAGE_ERROR,
                  "Unable to run plug-in \"%s\"\n(%s)\n\npipe() failed: %s",
                  gimp_object_get_name (plug_in),
                  gimp_file_get_utf8_name (plug_in->file),
                  g_strerror (errno));
    return FALSE;
  }
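The reuse itself is easy to reproduce outside GIMP. The following Python sketch (an illustration only, with a made-up function name) opens and closes a file and then creates a pipe; because the kernel hands out the lowest free descriptor number, the read end of the pipe gets the number that was just closed. It assumes no other thread opens or closes descriptors in between.

```python
import os


def reused_descriptor():
    """Open and close a file, then create a pipe; return (old fd, pipe read fd)."""
    fd = os.open(os.devnull, os.O_RDONLY)
    os.close(fd)                     # the number is free again
    read_end, write_end = os.pipe()  # the read end is allocated first
    try:
        return fd, read_end
    finally:
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    old_fd, pipe_fd = reused_descriptor()
    print(f"closed fd {old_fd}, pipe read end is fd {pipe_fd}")
```

Adding pipe and pipe2 to the traced syscalls in the strace command should make the corresponding call visible in the GIMP trace.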

Program locks up but NOT when run through strace

I am doing server-side rendering with wkhtmltoimage and each run is locking up at "88% loading" for 1-2 minutes. I decided to debug what was happening through strace but in a completely bizarre twist the program did NOT lock up. I've found this to be completely repeatable and consistent. Why on earth would strace make a program faster when by all rights the program should be slower ?!
Run without strace:
user@server:~/public_html/shapes$ time wkhtmltoimage --disable-smart-width --width 970 --format jpg '[THE URL]' '[THE PATH].jpg'
Loading page (1/2)
Rendering (2/2)
Done
real 1m45.724s
user 1m42.887s
sys 0m0.623s
Run WITH strace:
user@server:~/public_html/shapes$ time strace wkhtmltoimage --disable-smart-width --width 970 --format jpg '[THE URL]' '[THE PATH].jpg'
execve("/usr/local/bin/wkhtmltoimage", ["wkhtmltoimage", "--disable-smart-width", "--width", "970", "--format", "jpg", "[THE URL]"..., "[THE PATH]"...], [/* 21 vars */]) = 0
brk(0) = 0x311a000
...
exit_group(0) = ?
+++ exited with 0 +++
real 0m6.526s
user 0m0.582s
sys 0m0.377s
Server is private so I've redacted the URL and PATH but otherwise the output is correct. Also both runs create the correct output and I've cleared up temporary files to ensure it's not a caching issue. I've done these runs 10 times each to ensure it's not a random artifact but it happens consistently, THEREFORE the only logical conclusion is that strace is somehow changing the behaviour of wkhtmltoimage and I was really hoping somebody could tell me what. If I knew why strace makes the program not lockup I could probably find a solution.
Here is the hung process:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
772 sigapp 20 0 1485600 45760 14524 R 99.8 2.2 0:57.02 wkhtmltoimage
As root I can attach to the hung image using strace -p $(pidof wkhtmltoimage) and the result is:
gettimeofday({1436692542, 446246}, NULL) = 0
gettimeofday({1436692542, 556958}, NULL) = 0
gettimeofday({1436692542, 557161}, NULL) = 0
gettimeofday({1436692542, 659238}, NULL) = 0
gettimeofday({1436692542, 771381}, NULL) = 0
gettimeofday({1436692542, 771686}, NULL) = 0
gettimeofday({1436692542, 875783}, NULL) = 0
gettimeofday({1436692542, 987490}, NULL) = 0
gettimeofday({1436692542, 987781}, NULL) = 0
gettimeofday({1436692543, 84764}, NULL) = 0
...
System: Linux server 3.13.0-30-generic #55-Ubuntu SMP Fri Jul 4 21:40:53 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux (Running on 2 CPU Xen Guest)
Software: wkhtmltoimage 0.12.2.1 (with patched qt) <- installed from official website via .deb file
I've read this but not sure if relevant: Hung processes resume if attached to strace
UPDATE:
Tried with another program, webkit2pdf, and I am seeing lockups as well. This time the strace output is different. Both programs use WebKit:
root@server:~# strace -p `pidof webkit2pdf`
Process 4699 attached
restart_syscall(<... resuming interrupted call ...>
On a successful run of strace wkhtmltoimage ... I can see similar calls but not NEARLY as many preceded by an mmap:
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f9dbbbc1000
gettimeofday({1436696839, 361942}, NULL) = 0
gettimeofday({1436696839, 362254}, NULL) = 0
gettimeofday({1436696839, 362505}, NULL) = 0
gettimeofday({1436696839, 362787}, NULL) = 0
gettimeofday({1436696839, 363172}, NULL) = 0
gettimeofday({1436696839, 363568}, NULL) = 0
gettimeofday({1436696839, 363913}, NULL) = 0
mmap(NULL, 28672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f9db5000000
gettimeofday({1436696839, 364701}, NULL) = 0
mmap(NULL, 28672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f9db4ff9000
gettimeofday({1436696839, 365612}, NULL) = 0
gettimeofday({1436696839, 365956}, NULL) = 0
So per pvg's comment below this is most probably the locking mechanism for setAttribute working correctly under strace vs. not working normally.

In Linux initrd image, ethernet fails to work

I am working on a SABRE SD development board, which uses an i.MX6 quad-core processor. I have developed an initrd image for this board. The kernel boots and the initrd image is mounted successfully. Even the FEC Ethernet drivers are loaded properly.
But during the init process, DHCP fails to assign an IP address to the Ethernet device.
Running the 'dhcp' command under strace produced the following log.
In the log, a select system call times out, causing the error. The relevant portion of the log is given below.
socket(PF_INET, SOCK_RAW, IPPROTO_RAW) = 6
ioctl(6, SIOCGIFINDEX, {ifr_name="eth0", ifr_index=2}) = 0
ioctl(6, SIOCGIFHWADDR, {ifr_name="eth0", ifr_hwaddr=00:04:9f:02:b3:81}) = 0
close(6) = 0
clock_gettime(CLOCK_MONOTONIC, {53, 815520338}) = 0
write(1, "Sending discover...\n", 20Sending discover...
) = 20
socket(PF_PACKET, SOCK_DGRAM, 8) = 6
bind(6, {sa_family=AF_PACKET, proto=0x800, if2, pkttype=PACKET_HOST, addr(6)={0, ffffffffffff}, 20) = 0
sendto(6, "E\0\0014\0\0\0\0#\21y\272\0\0\0\0\377\377\377\377\0D\0C\1 ,h\1\1\6\0"..., 308, 0, {sa_family=AF_PACKET, proto=0x800, if2, pkttype=PACKET_HOST, add8
close(6) = 0
fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
clock_gettime(CLOCK_MONOTONIC, {53, 990583005}) = 0
select(6, [3 5], NULL, NULL, {3, 0}) = 0 (Timeout)
But when the same rootfs used in the initrd image is booted from the SD card, the dhcp command does not fail.
Can anyone help me with some clues?
with regards,
Vivek

Dancer randomly hangs when reading GET request

I am playing with Perl Dancer on Linux and all is nice and dandy if the browser connects to the server directly via LAN. However, when I connect via WAN and the browser is IE9, then occasionally the busy cursor won't go away.
I can provoke this by reloading the page approximately 10 times in a row. I get this problem even when I wait several seconds between each reload. The page itself is awfully simple and passes the W3C check.
It makes no difference whether I run Dancer as root, or whether the port is 80 or 3000. I also tested frequent reloading of a page with Apache and there does not seem to be an issue.
I ran strace and I have the impression that the request data is sometimes not available at the time Dancer tries to read it. This is what the trace looks like:
When it works:
{sa_family=AF_INET, sin_port=htons(52073), sin_addr=inet_addr("78.42.213.92")}, [16]) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfab5028) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, 0xbfab5070, SEEK_CUR) = -1 ESPIPE (Illegal seek)
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfab5028) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, 0xbfab5070, SEEK_CUR) = -1 ESPIPE (Illegal seek)
fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
getpeername(4, {sa_family=AF_INET, sin_port=htons(52073), sin_addr=inet_addr("78.42.213.92")}, [16]) = 0
read(4, "G", 1) = 1
read(4, "E", 1) = 1
read(4, "T", 1) = 1
When it hangs
{sa_family=AF_INET, sin_port=htons(52225), sin_addr=inet_addr("78.42.213.92")}, [16]) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfab5028) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, 0xbfab5070, SEEK_CUR) = -1 ESPIPE (Illegal seek)
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfab5028) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, 0xbfab5070, SEEK_CUR) = -1 ESPIPE (Illegal seek)
fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
getpeername(4, {sa_family=AF_INET, sin_port=htons(52225), sin_addr=inet_addr("78.42.213.92")}, [16]) = 0
read(4,
and then it sits there forever. Any idea what I can do?
I ran into a similar problem with IE9 connecting to a Catalyst dev server. Eric Lawrence (IE Team Lead!?) suggested it might be due to IE9's background connection feature. IE9 opens a background TCP connection to speed up future requests to the server, but this obviously causes problems for single threaded servers. If you're running Dancer's default dev server (HTTP::Server::Simple::PSGI), you won't be able to handle IE9.
I worked around it by proxying from Apache. It makes dev a little more of a hassle, but only when I have to test IE9.
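The effect of such a background connection on a single-threaded server can be reproduced with plain sockets. The Python sketch below is only an illustration, not Dancer's or HTTP::Server::Simple's code, and the function names are invented. One client connects and stays silent, a second one sends a real request. The server handles connections one at a time and, as a technique swapped in for this demo (it is not the Apache proxy workaround described above), puts a timeout on the first read so the silent connection cannot block it forever.

```python
import socket


def read_first_byte(listener, timeout):
    """Accept one connection and read one byte, giving up after timeout seconds."""
    conn, _ = listener.accept()
    with conn:
        conn.settimeout(timeout)
        try:
            return conn.recv(1)
        except socket.timeout:
            return None  # peer connected but sent nothing in time


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # any free port
    listener.listen(5)
    port = listener.getsockname()[1]
    # First client: connects and stays silent, like a speculative connection.
    idle = socket.create_connection(("127.0.0.1", port))
    # Second client: a normal request queued behind the idle one.
    real = socket.create_connection(("127.0.0.1", port))
    real.sendall(b"GET / HTTP/1.0\r\n\r\n")
    try:
        first = read_first_byte(listener, 0.2)
        second = read_first_byte(listener, 0.2)
    finally:
        idle.close()
        real.close()
        listener.close()
    return first, second


if __name__ == "__main__":
    print(demo())
```

Without the settimeout() call, the first read would block exactly like the read(4, line in the hanging trace, and the real request would never be served.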

How to trace per-file IO operations in Linux?

I need to track read system calls for specific files, and I'm currently doing this by parsing the output of strace. Since read operates on file descriptors I have to keep track of the current mapping between fd and path. Additionally, seek has to be monitored to keep the current position up-to-date in the trace.
Is there a better way to get per-application, per-file-path IO traces in Linux?
You could wait for the files to be opened so you can learn the fd and attach strace after the process launch like this:
strace -p pid -e trace=file -e read=fd
First, you probably don't need to keep track because mapping between fd and path is available in /proc/PID/fd/.
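A minimal Python sketch of that lookup, assuming Linux with /proc mounted and permission to inspect the target process; the function names are made up for illustration.

```python
import os
import tempfile


def fd_path(pid, fd):
    """Return the path the kernel currently associates with fd of process pid."""
    return os.readlink(f"/proc/{pid}/fd/{fd}")


def open_fds(pid):
    """Map every open descriptor of pid to its current target."""
    result = {}
    for name in os.listdir(f"/proc/{pid}/fd"):
        try:
            result[int(name)] = fd_path(pid, name)
        except FileNotFoundError:
            pass  # descriptor was closed while we were listing
    return result


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        print(tmp.fileno(), "->", fd_path(os.getpid(), tmp.fileno()))
```

Sockets and pipes show up as targets such as socket:[...] or pipe:[...] rather than as file paths, so a tracer still has to decide how to treat them.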
Second, maybe you should use the LD_PRELOAD trick and override the open, seek and read calls in C. There are some articles here and there about how to override malloc/free this way.
I guess it won't be too different to apply the same kind of trick for those system calls. It needs to be implemented in C, but it should take far less code and be more precise than parsing strace output.
systemtap - a kind of DTrace reimplementation for Linux - could be of help here.
As with strace you only have the fd, but with the scripting ability it is easy to maintain the filename for an fd (except with fun stuff like dup). The example script iotime illustrates this.
#! /usr/bin/env stap
/*
* Copyright (C) 2006-2007 Red Hat Inc.
*
* This copyrighted material is made available to anyone wishing to use,
* modify, copy, or redistribute it subject to the terms and conditions
* of the GNU General Public License v.2.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*
* Print out the amount of time spent in the read and write systemcall
* when each file opened by the process is closed. Note that the systemtap
* script needs to be running before the open operations occur for
* the script to record data.
*
* This script could be used to find out which files are slow to load
* on a machine. e.g.
*
* stap iotime.stp -c 'firefox'
*
* Output format is:
* timestamp pid (executable) info_type path ...
*
* 200283135 2573 (cupsd) access /etc/printcap read: 0 write: 7063
* 200283143 2573 (cupsd) iotime /etc/printcap time: 69
*
*/
global start
global time_io
function timestamp:long() { return gettimeofday_us() - start }
function proc:string() { return sprintf("%d (%s)", pid(), execname()) }
probe begin { start = gettimeofday_us() }
global filehandles, fileread, filewrite
probe syscall.open.return {
  filename = user_string($filename)
  if ($return != -1) {
    filehandles[pid(), $return] = filename
  } else {
    printf("%d %s access %s fail\n", timestamp(), proc(), filename)
  }
}
probe syscall.read.return {
  p = pid()
  fd = $fd
  bytes = $return
  time = gettimeofday_us() - @entry(gettimeofday_us())
  if (bytes > 0)
    fileread[p, fd] += bytes
  time_io[p, fd] <<< time
}
probe syscall.write.return {
  p = pid()
  fd = $fd
  bytes = $return
  time = gettimeofday_us() - @entry(gettimeofday_us())
  if (bytes > 0)
    filewrite[p, fd] += bytes
  time_io[p, fd] <<< time
}
probe syscall.close {
  if ([pid(), $fd] in filehandles) {
    printf("%d %s access %s read: %d write: %d\n",
           timestamp(), proc(), filehandles[pid(), $fd],
           fileread[pid(), $fd], filewrite[pid(), $fd])
    if (@count(time_io[pid(), $fd]))
      printf("%d %s iotime %s time: %d\n", timestamp(), proc(),
             filehandles[pid(), $fd], @sum(time_io[pid(), $fd]))
  }
  delete fileread[pid(), $fd]
  delete filewrite[pid(), $fd]
  delete filehandles[pid(), $fd]
  delete time_io[pid(),$fd]
}
It only works up to a certain number of files because the hash map is size limited.
strace now has new options to track file descriptors:
--decode-fds=set
    Decode various information associated with file descriptors. The default is decode-fds=none. set can include the following elements:
    path    Print file paths.
    socket  Print socket protocol-specific information.
    dev     Print character/block device numbers.
    pidfd   Print PIDs associated with pidfd file descriptors.
This is useful as file descriptors are reused after being closed, and /proc/$PID/fd only provides one snapshot in time, which is useless when debugging something in realtime.
Sample output follows. Note how file names are displayed in angle brackets and how FD 3 is reused in turn for /etc/ld.so.cache, /lib/x86_64-linux-gnu/libc.so.6, /usr/lib/locale/locale-archive and /home/florian/hello.
$ strace -e trace=desc --decode-fds=all cat hello 1>/dev/null
execve("/usr/bin/cat", ["cat", "hello"], 0x7fff42e20710 /* 102 vars */) = 0
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3</etc/ld.so.cache>
newfstatat(3</etc/ld.so.cache>, "", {st_mode=S_IFREG|0644, st_size=167234, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 167234, PROT_READ, MAP_PRIVATE, 3</etc/ld.so.cache>, 0) = 0x7f22edeee000
close(3</etc/ld.so.cache>) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3</usr/lib/x86_64-linux-gnu/libc-2.33.so>
read(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\206\2\0\0\0\0\0"..., 832) = 832
pread64(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "\6\0\0\0\4\0\0\0#\0\0\0\0\0\0\0#\0\0\0\0\0\0\0#\0\0\0\0\0\0\0"..., 784, 64) = 784
pread64(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "\4\0\0\0 \0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0"..., 48, 848) = 48
pread64(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0+H)\227\201T\214\233\304R\352\306\3379\220%"..., 68, 896) = 68
newfstatat(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "", {st_mode=S_IFREG|0755, st_size=1983576, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f22edeec000
pread64(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "\6\0\0\0\4\0\0\0#\0\0\0\0\0\0\0#\0\0\0\0\0\0\0#\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 2012056, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, 0) = 0x7f22edd00000
mmap(0x7f22edd26000, 1486848, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, 0x26000) = 0x7f22edd26000
mmap(0x7f22ede91000, 311296, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, 0x191000) = 0x7f22ede91000
mmap(0x7f22ededd000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, 0x1dc000) = 0x7f22ededd000
mmap(0x7f22edee3000, 33688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f22edee3000
close(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f22edcfe000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3</usr/lib/locale/locale-archive>
newfstatat(3</usr/lib/locale/locale-archive>, "", {st_mode=S_IFREG|0644, st_size=6055600, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 6055600, PROT_READ, MAP_PRIVATE, 3</usr/lib/locale/locale-archive>, 0) = 0x7f22ed737000
close(3</usr/lib/locale/locale-archive>) = 0
fstat(1</dev/null<char 1:3>>, {st_mode=S_IFCHR|0666, st_rdev=makedev(0x1, 0x3), ...}) = 0
openat(AT_FDCWD, "hello", O_RDONLY) = 3</home/florian/hello>
fstat(3</home/florian/hello>, {st_mode=S_IFREG|0664, st_size=6, ...}) = 0
fadvise64(3</home/florian/hello>, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f22edef5000
read(3</home/florian/hello>, "world\n", 131072) = 6
write(1</dev/null<char 1:3>>, "world\n", 6) = 6
read(3</home/florian/hello>, "", 131072) = 0
close(3</home/florian/hello>) = 0
close(1</dev/null<char 1:3>>) = 0
close(2</dev/pts/5<char 136:5>>) = 0
+++ exited with 0 +++
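With the path printed next to each descriptor, per-file accounting no longer needs an fd-to-path table. The following Python sketch is one hedged way to post-process such output: it sums the bytes returned by read and pread64 per decoded path. The function name and the regular expression are my own, and it only handles the simple fd<path> form seen above, not device descriptors such as 1</dev/null<char 1:3>> or interrupted calls.

```python
import re
from collections import defaultdict

# Matches e.g.: read(3</home/florian/hello>, "world\n", 131072) = 6
READ_LINE = re.compile(r'^(?:read|pread64)\((\d+)<([^<>]+)>.*\) = (\d+)$')


def bytes_read_per_path(trace_text):
    """Sum the return values of successful read/pread64 calls per decoded path."""
    totals = defaultdict(int)
    for line in trace_text.splitlines():
        match = READ_LINE.match(line.strip())
        if match:
            _fd, path, nbytes = match.groups()
            totals[path] += int(nbytes)
    return dict(totals)


# A few lines copied from the sample output above (raw string keeps the \n escapes).
SAMPLE = r'''
openat(AT_FDCWD, "hello", O_RDONLY) = 3</home/florian/hello>
read(3</home/florian/hello>, "world\n", 131072) = 6
write(1</dev/null<char 1:3>>, "world\n", 6) = 6
read(3</home/florian/hello>, "", 131072) = 0
close(3</home/florian/hello>) = 0
'''

if __name__ == "__main__":
    for path, total in bytes_read_per_path(SAMPLE).items():
        print(path, total)
```

Tracking the file position would additionally require following lseek and the offset argument of pread64, which this sketch leaves out.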
I think overriding open, seek and read is a good solution. But just FYI, if you want to parse and analyze the strace output programmatically, I did something similar before and put my code on GitHub: https://github.com/johnlcf/Stana/wiki
(I did that because I have to analyze strace results of programs run by others, and it is not easy to ask them to use LD_PRELOAD.)
Probably the least ugly way to do this is to use fanotify. Fanotify is a Linux kernel facility that allows cheaply watching filesystem events. I'm not sure if it allows filtering by PID, but it does pass the PID to your program so you can check if it's the one you're interested in.
Here's a nice code sample:
http://bazaar.launchpad.net/~pitti/fatrace/trunk/view/head:/fatrace.c
However, it seems to be under-documented at the moment. All the docs I could find are http://www.spinics.net/lists/linux-man/msg02302.html and http://lkml.indiana.edu/hypermail/linux/kernel/0811.1/01668.html
Parsing command-line utils like strace is cumbersome; you could use ptrace() syscall instead. See man ptrace for details.

Resources