core dump files on Linux: how to get info on open files?

I have a core dump file from a process that probably has a file descriptor leak (it opens files and sockets but apparently sometimes forgets to close some of them). Is there a way to find out which files and sockets the process had open before crashing? I can't easily reproduce the crash, so analyzing the core file seems to be the only way to get a hint about the bug.

If you have a core file and you have compiled the program with debugging options (-g), you can see where the core was dumped:
$ gcc -g -o something something.c
$ ./something
Segmentation fault (core dumped)
$ gdb something core
You can use this to do some post-mortem debugging. A few gdb commands: bt prints the stack trace, fr jumps to a given stack frame (see the output of bt).
Now, if you want to see which files are open at the time of a segmentation fault, handle the SIGSEGV signal and, in the handler, dump the contents of the /proc/PID/fd directory (e.g. with system("ls -l /proc/PID/fd") or execv).
With this information at hand you can easily see what caused the crash, which files were open, and whether the crash and the file descriptor leak are connected.

Your best bet is to install a signal handler for whatever signal is crashing your program (SIGSEGV, etc.).
Then, in the signal handler, inspect /proc/self/fd, and save the contents to a file. Here is a sample of what you might see:
Anderson cxc # ls -l /proc/8247/fd
total 0
lrwx------ 1 root root 64 Sep 12 06:05 0 -> /dev/pts/0
lrwx------ 1 root root 64 Sep 12 06:05 1 -> /dev/pts/0
lrwx------ 1 root root 64 Sep 12 06:05 10 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Sep 12 06:05 11 -> socket:[124061]
lrwx------ 1 root root 64 Sep 12 06:05 12 -> socket:[124063]
lrwx------ 1 root root 64 Sep 12 06:05 13 -> socket:[124064]
lrwx------ 1 root root 64 Sep 12 06:05 14 -> /dev/driver0
lr-x------ 1 root root 64 Sep 12 06:05 16 -> /temp/app/whatever.tar.gz
lr-x------ 1 root root 64 Sep 12 06:05 17 -> /dev/urandom
Then reset the signal's disposition to SIG_DFL and re-raise it from the handler (or install the handler with SA_RESETHAND), and you should still get a core dump as usual.
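A minimal sketch of such a handler, as my own illustration of the approach above (the output path is arbitrary, and opendir()/fprintf() are not async-signal-safe, so treat this strictly as a debugging aid): it walks /proc/self/fd with readlink(), writes one line per descriptor to a file, then re-raises the signal with the default action so the core dump still happens.
#include <dirent.h>
#include <limits.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void dump_open_fds(int sig)
{
    /* Write one "fd -> target" line per open descriptor. */
    DIR *dir = opendir("/proc/self/fd");
    FILE *out = fopen("/tmp/open-fds.txt", "w");   /* illustrative path */
    struct dirent *ent;

    if (dir && out) {
        while ((ent = readdir(dir)) != NULL) {
            char link[PATH_MAX], target[PATH_MAX];
            ssize_t n;

            if (ent->d_name[0] == '.')
                continue;
            snprintf(link, sizeof link, "/proc/self/fd/%s", ent->d_name);
            n = readlink(link, target, sizeof target - 1);
            if (n >= 0) {
                target[n] = '\0';
                fprintf(out, "%s -> %s\n", ent->d_name, target);
            }
        }
    }
    if (out)
        fclose(out);
    if (dir)
        closedir(dir);

    /* Restore the default action and re-raise so the usual core dump is produced. */
    signal(sig, SIG_DFL);
    raise(sig);
}

int main(void)
{
    signal(SIGSEGV, dump_open_fds);
    /* ... rest of the program ... */
    return 0;
}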

One way I get at this information is simply to run strings on the core file. For instance, when I ran file on a core recently, the argument list it reported was truncated because of the length of the folder names. I knew my run would have opened files from my home directory, so I just ran:
strings core.14930|grep jodie
But this is a case where I had a needle and a haystack.

If the program forgot to close those resources it might be because something like the following happened:
fd = open("/tmp/foo",O_CREAT);
//do stuff
fd = open("/tmp/bar",O_CREAT); //Oops, forgot to close(fd)
Now the descriptor for /tmp/foo is no longer stored anywhere in the program's memory.
If this didn't happen, you might be able to find the file descriptor number, but even then it is not very useful: descriptor numbers are reused continuously, so by the time you get to debug you won't know which file a given number actually referred to at the time.
I really think you should debug this live, with strace, lsof and friends.
If there is a way to do it from the core dump, I'm eager to know it too :-)

You can try using strace to see the open, socket and close calls the program makes.
Edit: I don't think you can get the information from the core; at most it will have the file descriptors somewhere, but this still doesn't give you the actual file/socket. (Assuming you can distinguish open from closed file descriptors, which I also doubt.)
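For example, a run along these lines records every descriptor-related call to a log file (the program name is a placeholder; on newer systems most opens show up as openat):
$ strace -f -e trace=open,openat,socket,close -o trace.log ./yourprogram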

Recently, during troubleshooting and analysis, a customer provided me with a core dump that had been generated on his filesystem before he went out of station. To quickly scan through the file and read its contents, I used the command
strings core.67545 > coredump.txt
and was then able to open the resulting file in a text editor.

A core dump is a copy of the memory the process had access to when it crashed. Depending on how the leak occurs, the process might have already lost its references to the handles, so the dump may prove to be useless.
lsof lists all currently open files in the system; you could check its output to find leaked sockets or files. Yes, you'd need to have the process running. You could run it under a specific username to easily discern which open files belong to the process you are debugging.
I hope somebody else has better information :-)

Another way to find out what files a process has opened - again, only at runtime - is to look into /proc/PID/fd/, which contains symlinks to the open files.

Related

socket fd from /proc/$PID/fd seems invalid

I know the pid of the process, and I need to obtain the socket fd used by it, so I look for it in /proc/$pid/fd, for instance:
$ ls -la /proc/1442/fd | grep socket
lrwx------ 1 root root 64 Jan 23 16:22 7 -> socket:[21807]
$
Now, when I pass the value 7 (the socket descriptor) to getsockopt(), I get an EBADF error. Is this not allowed from another process, even with root privileges?
What am I doing wrong?
File descriptors are per-process. They are not shared between processes.
If you want to access a file descriptor owned by another process, you can sometimes open() the path in /proc/<pid>/fd to get a copy of it. However, this only works on normal files; it doesn't work on sockets. (This question addresses the internal details.)
So, in short, no. There's no straightforward way I'm aware of for one process to "take over" a socket from another process, without that process's cooperation.
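As a small illustration of the regular-file case mentioned above (the pid and descriptor number here are hypothetical), opening the /proc entry gives you your own, independent descriptor for the same file - it does not hand you the other process's socket:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical: descriptor 5 of process 1442 refers to a regular file. */
    int fd = open("/proc/1442/fd/5", O_RDONLY);
    if (fd < 0) {
        /* For a socket:[...] entry this open fails (e.g. with ENXIO). */
        perror("open /proc/1442/fd/5");
        return 1;
    }

    char buf[256];
    ssize_t n = read(fd, buf, sizeof buf);
    printf("read %zd bytes via the other process's /proc fd entry\n", n);
    close(fd);
    return 0;
}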
It seems you can "take over" a socket as root, look:

Empty lxc core dump

Good day.
I cannot get a core dump file for any process launched inside an lxc container.
Here are my settings (inside container):
$ cat /proc/sys/kernel/core_pattern
/var/crash/coredump-%e.%p
$ ulimit -c
unlimited
$ ls -lha /var/crash/
drwxrwxrwx 2 root root 13 Oct 14 23:34 .
...
I start a C program that causes a core dump inside the container:
$ ./c
Segmentation fault
And I don't get the "core dumped" message (unlike on the base system; outside the container everything works perfectly).
But a new file is generated at the core dump path, and it has zero size:
$ ls -lha /var/crash/*
-rw------- 1 root root 0 Oct 14 23:43 /var/crash/coredump-c.1866
The behavior is the same for the php5-fpm process: it also generates a zero-sized core dump when it crashes, even though the limit for the pool is set to "unlimited" in the pool settings.
Is there anything I've missed? Searching the Internet for things like "limits lxc" or "core dump lxc" didn't turn up anything relevant. Thanks!

How to see which files are in use in Linux

I have a question: how can I see which files are in use in Linux? To be honest, this OS is not a normal version of Linux; it is very stripped down, so for example there is no command like "lsof". I found the command "strace", but that is not what I am looking for. I have heard that I can list these files by hacking the kernel?
I want to see which files are in use because there is little free space on this machine and I want to delete files that are not in use while the program is running.
I'm sorry for my weak English.
You can inspect each process's open files by walking the /proc virtual filesystem.
Every running process has an entry under /proc/PID. There's a directory in each process directory called 'fd', which represents the process's currently open file descriptors. These appear as links to the actual resources.
e.g. on a VM I have running
root@wvm:/proc/1213/fd# pidof dhclient
1213
root@wvm:/proc/1213/fd# cd /proc/1213/fd
root@wvm:/proc/1213/fd# ls -l
total 0
lrwx------ 1 root root 64 Apr 8 09:11 0 -> /dev/null
lrwx------ 1 root root 64 Apr 8 09:11 1 -> /dev/null
lrwx------ 1 root root 64 Apr 8 09:11 2 -> /dev/null
lrwx------ 1 root root 64 Apr 8 09:11 20 -> socket:[4687]
lrwx------ 1 root root 64 Apr 8 09:11 21 -> socket:[4688]
l-wx------ 1 root root 64 Apr 8 09:11 3 -> /var/lib/dhcp/dhclient.eth0.leases
lrwx------ 1 root root 64 Apr 8 09:11 4 -> socket:[4694]
lrwx------ 1 root root 64 Apr 8 09:11 5 -> socket:[4698]
root@wvm:/proc/1213/fd#
Looking at the kernel process information for 'dhclient', I find its pid, and then look in the fd subdirectory for that process id. It has a small set of open descriptors: stdin, stdout and stderr (0, 1, 2) have all been attached to /dev/null, there are four sockets open, and file descriptor 3 is attached to a data file, /var/lib/dhcp/dhclient.eth0.leases.
So you could duplicate the functionality of lsof using shell tools to walk /proc and filter out the file names from these links.
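If even the usual shell tools are missing on a stripped-down system, the same walk is only a few lines of C using opendir() and readlink(); a minimal sketch of my own, taking the pid as a command-line argument:
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    char dirpath[64], linkpath[PATH_MAX], target[PATH_MAX];
    struct dirent *ent;
    DIR *dir;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    /* List /proc/<pid>/fd and resolve each symlink to the resource it names. */
    snprintf(dirpath, sizeof dirpath, "/proc/%s/fd", argv[1]);
    dir = opendir(dirpath);
    if (!dir) {
        perror(dirpath);
        return 1;
    }
    while ((ent = readdir(dir)) != NULL) {
        ssize_t n;

        if (ent->d_name[0] == '.')
            continue;
        snprintf(linkpath, sizeof linkpath, "%s/%s", dirpath, ent->d_name);
        n = readlink(linkpath, target, sizeof target - 1);
        if (n >= 0) {
            target[n] = '\0';
            printf("%s -> %s\n", ent->d_name, target);
        }
    }
    closedir(dir);
    return 0;
}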
Are you able to use the "top" command? If so, it shows you the processes using the most resources on the system. You can then do
ps -ef|grep <process_no>
This gives you the details of that process. You can either stop that process or kill it using
kill -9 <process no>
Use the lsof command to list all open files by process.

Some high level questions about how PAM is designed

I'm creating a PAM module for a project. The PAM module will use a library that will be re-used by some command line utilities (rather than re-writing everything each time). I want this library to interpret policy that discriminates against, and/or logs according to, the subnet membership of the remote host. As near as I can tell this value probably comes from the authenticating application, but I don't know. Since the shared object won't have access to the pamh structure from libpam, I can't just do a pam_get_item (like I would be able to from the PAM module itself), so I've had to resort to other means.
The best solution I've come up with is to have the shared object look for a connected TTY; if there is one, go to utmp, find the login process associated with that TTY, and extract the IP address from there. If there isn't a TTY, assume it's the initial login of a network user. The library then iterates over the sockets (which I've defined as basically any symlink with the word "socket" in the target's filename when you do an ls -l /proc/<pid>/fd) and uses the socket inode number to cross-reference with /proc/net/tcp and extract the remote IP address associated with that inode number. If it doesn't find the inode there, it assumes the socket is Unix-domain or tcp6 (IPv6 support is forthcoming and not terribly important for the near future). If it still isn't able to find it, it assumes that some daemon has called an application linking against the library and interprets it as such (it might do something eventually, if that turns out to be worthwhile, but for now it's just a big NOOP if the first two checks don't return anything).
It seems to work but I have some high level questions about how PAM is supposed to work:
Is there some official standard that governs PAM operation? For example, is it covered by a POSIX standard somewhere? I know there are multiple PAM implementations (four or five that I've found thus far) but I don't know if the existing commonalities are de jure or de facto, or just how I happen to have my system configured.
After I did a ls -l /proc/<pid>/fd > /lsOutput from the module itself (via system()):
[root@hypervisor pam]# cat /lsOutput
total 0
lrwx------. 1 root root 64 Jun 15 15:09 0 -> /dev/null
lrwx------. 1 root root 64 Jun 15 15:09 1 -> /dev/null
lrwx------. 1 root root 64 Jun 15 15:09 2 -> /dev/null
lr-x------. 1 root root 64 Jun 15 15:09 3 -> socket:[426180]
[root@hypervisor pam]#
And issuing a manual ls after the user logs in:
[root@hypervisor pam]# ls -l /proc/18261/fd
total 0
lrwx------. 1 root root 64 Jun 15 15:15 0 -> /dev/null
lrwx------. 1 root root 64 Jun 15 15:15 1 -> /dev/null
lrwx------. 1 root root 64 Jun 15 15:15 11 -> /dev/ptmx
lrwx------. 1 root root 64 Jun 15 15:15 12 -> /dev/ptmx
lrwx------. 1 root root 64 Jun 15 15:15 13 -> socket:[426780]
lrwx------. 1 root root 64 Jun 15 15:15 14 -> socket:[426829]
lrwx------. 1 root root 64 Jun 15 15:15 2 -> /dev/null
lrwx------. 1 root root 64 Jun 15 15:15 3 -> socket:[426180]
lrwx------. 1 root root 64 Jun 15 15:15 4 -> socket:[426322]
lr-x------. 1 root root 64 Jun 15 15:15 5 -> pipe:[426336]
l-wx------. 1 root root 64 Jun 15 15:15 6 -> pipe:[426336]
lrwx------. 1 root root 64 Jun 15 15:15 7 -> socket:[426348]
lrwx------. 1 root root 64 Jun 15 15:15 8 -> socket:[426349]
lrwx------. 1 root root 64 Jun 15 15:15 9 -> /dev/ptmx
[root@hypervisor pam]#
So basically, it seems like both the TTY and any additional sockets get opened only AFTER the session modules finish (my temporary test module's session handling is the last in the stack for the sshd service). I've been unable to get it to be otherwise (or even think of a time when the connecting client won't be a TCP socket at descriptor 3).
Is this just due to my lack of imagination or is it necessarily so? I'm leaning towards the latter, as it would seem that communicating with the client is a prerequisite to doing pretty much anything else that's useful. I don't know that for sure, so I feel I should ask somebody. Will descriptor 3 always be the authenticating client (my .so only assumes that it's the lowest-numbered TCP socket, and only if there's no TTY, but it seems like 3 should always be the descriptor for the connecting client)? Would pulling the first TCP descriptor be a "deterministic" way of establishing the remote client's identity? Or is there no prescribed way this is supposed to play out, and that's just how either my system is configured or how SSH has chosen to interface with PAM?
Is it sshd that's setting the rhost value, or is that coming from some place else? I've tried grep-ing over the source code for both SSH and libpam, but no dice. I can see where libpam handles the setting of the host value when something calls pam_set_item, but not where pam_set_item actually gets called to set it to this or that particular host.
Any amount of help would be appreciated, I've googled but I'm starting to get splinters on my fingertips from scraping the bottom of the barrel.
Main reason I'm interested in knowing this is so that I'll end up not only with the "right" answer but mostly so that I won't have any surprises later on down the road. We have some Solaris platforms we may do this on, but my main motivation is to have assumptions that are grounded in things that are actually going to be constant.
I also realize that I could have the client programs/modules feed the host information to the library, but that would likely involve rewriting code two or three times (as the CLI tools prepare session information from utmp and the PAM module from pam_get_item) and would potentially make the project look more complex than it really needs to be.
Answering some of your questions:
"Is there some official standard that governs PAM operation?"
Apparently, yes. Wikipedia's entry on Pluggable_Authentication_Module says "PAM was standardized as part of the X/Open UNIX standardization process, resulting in the X/Open Single Sign-on (XSSO) standard." I have never found this particularly relevant to my dealings with it.
"Is this just due to my lack of imagination or is it necessarily so?"
<magicEightBall>"Concentrate and ask again"</magicEightBall> (It's ambiguous which "this" is being referred to - perhaps you can clarify?)
"Will descriptor 3 always be the authenticating client?"
This is a behaviour of the application, rather than PAM.
Would pulling the first TCP descriptor be a "deterministic" way of establishing the remote client's identity?
Also a behaviour of the application.
"Is it sshd that's setting the rhost value or is that coming from some place else?"
It is sshd that sets the rhost value. In openssh's file auth-pam.c, function sshpam_init(), you'll find:
sshpam_err = pam_set_item(sshpam_handle, PAM_RHOST, pam_rhost);
Some general notes:
Rather than keying off the TTY - which, as you've discovered, gets set later - you can walk the process lineage via getppid() and /proc.
Don't trust system() in PAM modules - it uses the user's environment, which may not be the one you expect. Use fork()/execlp() instead.
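As a sketch of the fork()/execlp() pattern the second note recommends (the log path here is just an illustration): the child redirects stdout to a file and execs ls directly, so no shell is involved the way it is with system().
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Dump this process's open descriptors to logpath without going through
 * system() and /bin/sh. */
static void dump_fds(const char *logpath)
{
    char fdpath[64];
    pid_t pid;

    /* Build the path in the parent so the listing shows our descriptors. */
    snprintf(fdpath, sizeof fdpath, "/proc/%d/fd", (int)getpid());

    pid = fork();
    if (pid == 0) {
        int out = open(logpath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (out >= 0) {
            dup2(out, STDOUT_FILENO);
            close(out);
        }
        execlp("ls", "ls", "-l", fdpath, (char *)NULL);
        _exit(127);                 /* exec failed */
    } else if (pid > 0) {
        waitpid(pid, NULL, 0);      /* reap the child */
    }
}

int main(void)
{
    dump_fds("/tmp/pam-fds.txt");   /* illustrative path */
    return 0;
}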

Less gets keyboard input from stderr?

I'm taking a look at the code to the 'less' utility, specifically how it gets keyboard input. Interestingly, on line 80 of ttyin.c, it sets the file descriptor to read from:
/*
* Try /dev/tty.
* If that doesn't work, use file descriptor 2,
* which in Unix is usually attached to the screen,
* but also usually lets you read from the keyboard.
*/
#if OS2
/* The __open() system call translates "/dev/tty" to "con". */
tty = __open("/dev/tty", OPEN_READ);
#else
tty = open("/dev/tty", OPEN_READ);
#endif
if (tty < 0)
tty = 2;
Isn't file descriptor 2 stderr? If so, WTH?! I thought keyboard input was sent through stdin.
Interestingly, even if you do ls -l * | less, after the file finishes loading, you can still use the keyboard to scroll up and down, but if you do ls -l * | vi, then vi will yell at you because it doesn't read from stdin. What's the big idea? How did I end up in this strange new land where stderr is both a way to report errors to the screen and read from the keyboard? I don't think I'm in Kansas anymore...
$ ls -l /dev/fd/
lrwx------ 1 me me 64 2009-09-17 16:52 0 -> /dev/pts/4
lrwx------ 1 me me 64 2009-09-17 16:52 1 -> /dev/pts/4
lrwx------ 1 me me 64 2009-09-17 16:52 2 -> /dev/pts/4
When logged in at an interactive terminal, all three standard file descriptors point to the same thing: your TTY (or pseudo-TTY).
$ ls -fl /dev/std{in,out,err}
lrwxrwxrwx 1 root root 4 2009-09-13 01:57 /dev/stdin -> fd/0
lrwxrwxrwx 1 root root 4 2009-09-13 01:57 /dev/stdout -> fd/1
lrwxrwxrwx 1 root root 4 2009-09-13 01:57 /dev/stderr -> fd/2
By convention, we read from 0 and write to 1 and 2. However, nothing prevents us from doing otherwise.
When your shell runs ls -l * | less, it creates a pipe from ls's file descriptor 1 to less's file descriptor 0. Obviously, less can no longer read the user's keyboard input from file descriptor 0 – it tries to get the TTY back however it can.
If less has not been detached from the terminal, open("/dev/tty") will give it the TTY.
However, in case that fails... what can you do? less makes one last attempt at getting the TTY, assuming that file descriptor 2 is attached to the same thing that file descriptor 0 would be attached to, if it weren't redirected.
This is not failproof:
$ ls -l * | setsid less 2>/dev/null
Here, less is given its own session (so it is no longer a part of the terminal's active process group, causing open("/dev/tty") to fail), and its file descriptor 2 has been changed – now less exits immediately, because it is outputting to a TTY yet it fails to get any user input.
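A tiny standalone demonstration of the same fallback pattern (my own sketch, not less's code): when stdin is a pipe, the program still reads a keystroke from /dev/tty, or from descriptor 2 as a last resort.
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* stdin may be a pipe (e.g. when run as `ls -l | ./demo`), so get the
     * terminal back the way less does: try /dev/tty, fall back to fd 2. */
    int tty = open("/dev/tty", O_RDONLY);
    if (tty < 0)
        tty = 2;

    char c;
    fprintf(stderr, "type a character and press Enter: ");
    if (read(tty, &c, 1) == 1)
        fprintf(stderr, "got '%c'\n", c);

    if (tty != 2)
        close(tty);
    return 0;
}
Run it as ls -l | ./demo and it still gets your input; run it under setsid with 2>/dev/null, as in the example above, and both routes fail.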
Well... first off, you seem to be missing the open() call which opens '/dev/tty'. It only uses file descriptor 2 if the call to open() fails. On a standard Linux system, and probably many Unices, '/dev/tty' exists and the open is unlikely to fail.
Secondly, the comment at the top provides a limited amount of explanation as to why they fall back to file descriptor 2. My guess is that stdin, stdout, and stderr are all pretty much connected to '/dev/tty' anyway, unless redirected. And since the most common redirections are of stdin and/or stdout (via piping or < / >), and less often of stderr, the odds are that stderr is the descriptor most likely to still be connected to the "keyboard".
The same question with an answer ultimately from the person who asked it is on linuxquestions although they quote slightly different source from less. And no, I don't understand most of it so I can't help beyond that :)
It appears to be Linux specific functionality that sends keyboard input to FD 2.
