When I do
ls /proc/[pid]/fd
sometimes I get no output; it seems that there are no file descriptors in that directory.
What does it mean when a process has no open file descriptors?
The process in question is most likely a daemon: daemon processes intentionally close their standard file descriptors to avoid hanging onto resources. (They will also chdir to the root directory, call fork() an extra time, and perform a number of more obscure operations for the same reason.)
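For illustration, here is a minimal sketch of the classic daemonization sequence (error handling omitted; redirecting the standard descriptors to /dev/null is the common variant of "closing" them):

#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Minimal daemonization sketch; error handling omitted. */
void daemonize(void) {
    if (fork() != 0) exit(0);   /* parent exits; child continues */
    setsid();                   /* new session, no controlling terminal */
    if (fork() != 0) exit(0);   /* second fork: can never reacquire a tty */
    chdir("/");                 /* don't pin any mounted filesystem */

    /* Redirect the standard descriptors to /dev/null. */
    int fd = open("/dev/null", O_RDWR);
    dup2(fd, STDIN_FILENO);
    dup2(fd, STDOUT_FILENO);
    dup2(fd, STDERR_FILENO);
    if (fd > STDERR_FILENO) close(fd);
}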
Related
I've read that you can pass a file descriptor to another process there, which seems perfect for what I want. Any chance that's doable in Haskell in any way?
To be clear, I'm not forking and I can't pre-open the file. I actually need a way to pass a file descriptor (mainly stdin) from a bunch of processes to a daemon, to avoid having to keep processes up just to forward their input; that would fill the process list quite fast and would probably eat resources for no reason.
Thanks!
You can get the file descriptor of STDIN from the unix package and UNIX-domain sockets from the network package.
I've never tried passing a file descriptor between processes, but it should work the same way in Haskell as in any other language.
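For reference, the mechanism underneath is SCM_RIGHTS ancillary data sent over a UNIX-domain socket; the network package exposes it as sendFd/recvFd in Network.Socket, if memory serves. A rough C sketch of the sending side, assuming sock is an already-connected AF_UNIX socket and omitting error handling:

#include <string.h>
#include <sys/socket.h>

/* Send fd_to_send to the peer of a connected AF_UNIX socket (sketch). */
int send_fd(int sock, int fd_to_send) {
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };  /* must send >= 1 byte */

    union {                       /* properly aligned ancillary buffer */
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    memset(&u, 0, sizeof(u));

    struct msghdr msg = {
        .msg_iov = &iov,
        .msg_iovlen = 1,
        .msg_control = u.buf,
        .msg_controllen = sizeof(u.buf),
    };

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;           /* "pass these descriptors" */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
}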
I just debugged a program that did roughly:
pthread_create(...);
close(0);                               // free descriptor 0
int t = open("/named_pipe", O_RDONLY);  // O_RDONLY assumed; hope it lands on fd 0
assert(t == 0);                         // occasionally fails
Occasionally it fails: pthread_create actually, briefly, opens file descriptors on the new thread (specifically /sys/devices/system/cpu/online), and if you're unlucky that happens between the close and the open above, making t something other than 0.
What's the safest way to do this? What, if anything, is guaranteed about pthread_create regarding file descriptors? Am I guaranteed that if there are 3 file descriptors open before I call pthread_create, there will also be 3 open once it has returned and control has been passed to my function on the new thread?
In multi-threaded programs, you need to use dup2 or dup3 to replace file descriptors. The old trick of immediately reusing a descriptor after close no longer works because other threads can create file descriptors at any time. Such file descriptors can even be created (and closed) implicitly by glibc, because many kernel interfaces use file descriptors.
dup2 is the standard interface. Linux also has dup3, with which you can atomically create the file descriptor with the O_CLOEXEC flag set. Otherwise, there would still be a race condition, and the descriptor could leak to a subprocess if the process ever forks and executes a new program.
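Applied to the snippet in the question, a race-free version opens the pipe on whatever descriptor happens to be free and then atomically moves it onto descriptor 0. A sketch (the function name is illustrative; error handling omitted):

#include <fcntl.h>
#include <unistd.h>

/* Install /named_pipe as stdin without racing other threads (sketch). */
void install_pipe_as_stdin(void) {
    int t = open("/named_pipe", O_RDONLY);  /* lands on whichever fd is free */
    dup2(t, 0);                             /* atomically makes it fd 0 */
    if (t != 0)
        close(t);                           /* drop the temporary descriptor */
}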
A process running on Linux writes some data to a file on the file system, and then invokes close(). Immediately afterwards, another process invokes open() and reads from the file.
Is the second process always 100% guaranteed to see an updated file?
What happens when using a network filesystem, and the two processes run on the same host?
What happens when the two processes are on different hosts?
close() does not guarantee that the data is written to disk, you have to use fsync() for this. See Does Linux guarantee the contents of a file is flushed to disc after close()? for a similar question.
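A minimal sketch of the write-then-fsync pattern (the path, buffer, and function name are illustrative; error handling omitted):

#include <fcntl.h>
#include <unistd.h>

/* Write a buffer and force it to stable storage before closing (sketch). */
void write_durably(const char *path, const void *buf, size_t len) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    write(fd, buf, len);
    fsync(fd);            /* block until the data reaches the disk */
    close(fd);            /* close() alone gives no durability guarantee */
}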
While stracing some Linux daemons (e.g. sendmail), I noticed that some of them call close() on a number of descriptors (usually ranging from 3 to 255) right at the beginning. Is this done on purpose, or is it a side effect of doing something else?
It is usually done as part of making a process a daemon.
All file descriptors are closed so that the long-running daemon does not unnecessarily hold any resources. For example, if a daemon were to inherit an open file and the daemon did not close it then the file could not be deleted (the storage for it would remain allocated until close) and the filesystem that the file is on could not be unmounted.
Daemonizing a process will also take a number of other actions, but those actions are beyond the scope of this question.
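The descriptor-closing step itself often looks roughly like the following sketch (the exact upper bound varies between implementations):

#include <unistd.h>

/* Close every descriptor a daemon may have inherited (sketch). */
void close_inherited_fds(void) {
    long maxfd = sysconf(_SC_OPEN_MAX);
    if (maxfd < 0)
        maxfd = 256;              /* traditional fallback limit */
    for (long fd = 3; fd < maxfd; fd++)
        close((int)fd);           /* EBADF for never-opened slots is harmless */
}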
If one of my processes opens a file, let's say for reading only, does the OS guarantee that no other process will write to it while I'm reading, possibly leaving the reading process with the first part of the old file version and the second part of the newer version, making data integrity questionable?
I am not talking about pipes, which have no seek, but about regular files, with the ability to seek (at least when opened by only one process).
No, other processes can change the file contents as you are reading it. Try running "man fcntl" and ignore the section on "advisory" locks; those are "optional" locks that processes only have to pay attention to if they want. Instead, look for the (alas, non-POSIX) "mandatory" locks. Those are the ones that will protect you from other programs. Try a read lock.
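For what it's worth, a whole-file read lock via fcntl might look like the following sketch (fd is assumed to be an already-open descriptor; whether the lock is advisory or mandatory depends on the mount options and the file's mode bits):

#include <fcntl.h>
#include <unistd.h>

/* Take a shared (read) lock covering the whole file, then release it (sketch). */
void read_with_lock(int fd) {
    struct flock fl = {
        .l_type   = F_RDLCK,    /* shared read lock */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,          /* 0 means "to end of file" */
    };
    fcntl(fd, F_SETLKW, &fl);   /* F_SETLKW waits until the lock is granted */
    /* ... read the file ... */
    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);    /* release */
}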
No, if you open a file, other processes can write to it, unless you use a lock.
On Linux, you can add an advisory lock on a file with:
#include <sys/file.h>
...
flock(file_descriptor, LOCK_EX); // apply an advisory exclusive lock
// ... read or write the file ...
flock(file_descriptor, LOCK_UN); // release the lock
Any process which can open the file for writing, may write to it. Writes can happen concurrently with your own writes, resulting in (potentially) indeterminate states.
It is your responsibility as an application writer to ensure that Bad Things don't happen. In my opinion mandatory locking is not a good idea.
A better idea is not to grant write access to processes which you don't want to write to the file.
If several processes open a file, they will have independent file pointers, so they can seek (with lseek()) without affecting one another.
If a file is opened by a threaded program (or, more generally, by a task that shares its file descriptors with another), the file pointer is also shared, so you need another method of accessing the file to avoid race conditions causing chaos: normally pread, pwrite, or the scatter/gather functions readv and writev. A sketch of the pread approach follows (the helper name is illustrative):
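#include <unistd.h>

/* Read at an explicit offset without touching the shared file pointer,
   so concurrent threads on the same descriptor cannot race (sketch). */
ssize_t read_at(int fd, void *buf, size_t len, off_t off) {
    return pread(fd, buf, len, off);  /* lseek()+read() would race here */
}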