What can I assume about pthread_create and file descriptors? - linux

I just debugged a program that did roughly:
pthread_create(...);
close(0);
int t = open("/named_pipe", O_RDONLY);
assert(t == 0);
Occasionally it fails, because pthread_create actually briefly opens a file descriptor on the new thread – specifically for /sys/devices/system/cpu/online – which, if you're unlucky, happens between the close and open above, making t something other than 0.
What's the safest way to do this? What, if anything, is guaranteed about pthread_create regarding file descriptors? Am I guaranteed that if there are 3 file descriptors open before I call pthread_create, there will also be 3 open when it has returned and control has been passed to my function on the new thread?

In multi-threaded programs, you need to use dup2 or dup3 to replace file descriptors. The old trick of relying on immediate reuse after close no longer works, because other threads can create file descriptors at any time. Such file descriptors can even be created (and closed) implicitly by glibc, because many kernel interfaces use file descriptors.
dup2 is the standard interface. Linux also has dup3, with which you can atomically create the file descriptor with the O_CLOEXEC flag set. Otherwise, there would still be a race condition, and the descriptor could leak to a subprocess if the process ever forks and executes a new program.

Related

Number of watched file descriptors inside an epoll

I am looking for a way to check the current number of file descriptors being monitored by an epoll instance. I use the following for creating and populating the epoll instance:
epoll_create
epoll_ctl
Platform is Gnu/Linux.
As far as I know, there is no system call that reports the number of file descriptors being monitored by an epoll instance. You can achieve this by maintaining a counter variable: increment/decrement it after successfully adding/removing a file descriptor with epoll_ctl().

multi-threaded file locking in linux

I'm working on a multithreaded application where multiple threads may want exclusive access to the same file. I'm looking for a way of serializing these operations. I was planning to use flock, lockf, or fcntl locking. However, it appears that with these methods an attempt to lock a file by a second thread will be granted when a first thread already owns the lock, because the two threads are in the same process. This is according to the manpages for flock and fcntl (and I guess on Linux lockf is implemented with fcntl). It is also supported by this other question. So, are there other ways of locking a file in Linux that work at thread level instead of process level?
Some alternatives that I came up with which I do not like are:
1) Use a lockfile (xxx.lock) opened with the O_CREAT | O_EXCL flags. Under contention, the open will succeed in only one thread. The problem is that the other threads then have to spin on the call until they acquire the lock, meaning I have to _yield() or sleep(), which makes me think this is not a great option.
2) Keep a mutex-protected list of all open files. When a thread wants to open/close a file, it has to lock the list first. When opening a file, it searches the list to see whether the file is already open. This sounds particularly inefficient, because it requires a significant amount of work even when nobody owns the file yet.
Are there other ways of doing this?
Edit:
I just discovered this text in my system's manpages which isn't in the online man pages:
If a process uses open(2) (or similar) to obtain more than one descriptor for the same file, these descriptors are treated independently by flock(). An attempt to lock the file using one of these file descriptors may be denied by a lock that the calling process has already placed via another descriptor.
I'm not happy about the words "may be denied", I'd prefer "will be denied" but I guess it's time to test that.

When does write() to a file return EWOULDBLOCK?

I want to append data often to a file on the local filesystem. I want to do this without blocking for too long, and without making any worker threads. On Linux kernel 2.6.18.
It seems that the POSIX AIO implementation for glibc on Linux makes a userspace threadpool and blocks those threads. Which is cool, but I could just as easily spin off my own special dedicated file blocking thread.
http://www.kernel.org/doc/man-pages/online/pages/man7/aio.7.html
And it's my understanding that the Linux Kernel AIO implementation currently blocks on append. Appending is the only thing I want to do.
http://code.google.com/p/kernel/wiki/AIOUserGuide
https://groups.google.com/forum/#!msg/linux.kernel/rFOD6mR2n30/13VDXRTmBCgJ
I'm considering opening the file with O_NONBLOCK, and then doing a kind of lazy writing where if it EWOULDBLOCK, then try the write again later. Something like this:
open(pathname, O_WRONLY | O_CREAT | O_APPEND | O_NONBLOCK, 0644);
call write(), and check for the errors EAGAIN and EWOULDBLOCK;
if EAGAIN or EWOULDBLOCK, save the data to be written and try the write() again later.
Is this a good idea? Is there any actual advantage to it? If I'm the only one with an open file descriptor to that file, and a write() returns EWOULDBLOCK, is it any less likely to return EWOULDBLOCK later? Will it ever return EWOULDBLOCK? If write() doesn't return EWOULDBLOCK, does that mean it will return swiftly?
In other words, under exactly what circumstances, if any, will write() to a local file fail with EWOULDBLOCK on Linux 2.6.18?
I'm not sure about the local file system, but I'm pretty sure that you can get EWOULDBLOCK when trying to write to a file on a mounted file system (e.g. NFS). The problem is that normally you don't know whether it is really a "local" hard disk unless you specifically check for this every time you create/open the file. How to check this is of course system dependent.
Even if the system creates some additional thread to do the actual write, this thread would have a buffer (which would not be infinite), so if you write fast enough, you could get EWOULDBLOCK.
under ... what circumstances ... will write() to a local file fail with EWOULDBLOCK
Perhaps there is no circumstance for a file. The Linux man page for write(2) states that EWOULDBLOCK would only be returned for a file descriptor that refers to a socket.
EAGAIN or EWOULDBLOCK
The file descriptor fd refers to a socket and has been marked nonblocking (O_NONBLOCK), and the write would block. POSIX.1-2001 allows either error to be returned for this case, and does not require these constants to have the same value, so a portable application should check for both possibilities.
Apparently this behavior is related to the fact that a socket would employ record locks, whereas a simple file would not.

Open file in kthread on behalf of a user process

I am writing a Linux kernel module which starts a kthread when a user process calls into it (using ioctl).
How can I open a file from this kthread on behalf of the user process, so that when it returns, the user process can access the file itself?
It's not really sensible to do this. To open a file that the userspace process can read, you need to return a file descriptor to that process.
Potentially you could return a UNIX-domain socketpair connecting the kernel thread to the userspace thread, and have the kernel thread pass open file descriptors across that socket using a SCM_RIGHTS message.
It is likely to be more appropriate, however, to simply open the file in the context of the original process in the ioctl() call and return the file descriptor there.

simultaneous read on file descriptor from two threads

My question: in Linux (and in FreeBSD, and generally in UNIX), is it possible/legal to read a single file descriptor simultaneously from two threads?
I did some searching but found nothing, although a lot of people ask similar questions about reading from and writing to a socket fd at the same time (meaning reading while another thread is writing, not reading while another is reading). I have also read some man pages but got no clear answer to my question.
Why do I ask? I tried to implement a simple program that counts lines in stdin, like wc -l. I actually was testing my home-made C++ I/O engine for overhead, and discovered that wc is 1.7 times faster. I trimmed down some C++ and came closer to wc's speed but didn't reach it. Then I experimented with the input buffer size, optimized it, but still wc is clearly a bit faster. Finally I created 2 threads which read the same STDIN_FILENO in parallel, and this at last was faster than wc! But the line count became incorrect... so I suppose the reads return some unexpected junk. Doesn't the kernel care who reads?
Edit: I did some research and discovered that calling read directly via syscall does not change anything. The kernel code seems to do some synchronization, but I didn't understand much of it (read_write.c).
That's unspecified behavior; POSIX says:
The read() function shall attempt to read nbyte bytes from the file
associated with the open file descriptor, fildes, into the buffer
pointed to by buf. The behavior of multiple concurrent reads on the
same pipe, FIFO, or terminal device is unspecified.
About accessing a single file descriptor concurrently (i.e. from multiple threads or even processes), I'm going to cite POSIX.1-2008 (IEEE Std 1003.1-2008), Subsection 2.9.7 Thread Interactions with Regular File Operations:
2.9.7 Thread Interactions with Regular File Operations
All of the following functions shall be atomic with respect to each other in the effects specified in POSIX.1-2008 when they operate on regular files or symbolic links:
[…] read() […]
If two threads each call one of these functions, each call shall either see all of the specified effects of the other call, or none of them. […]
At first glance, this looks quite good. However, I hope you did not miss the restriction "when they operate on regular files or symbolic links".
#jarero cites:
The behavior of multiple concurrent reads on the same pipe, FIFO, or terminal device is unspecified.
So, implicitly, we're agreeing, I assume: it depends on the type of the file you are reading. You said you read from STDIN. Well, if your STDIN is a plain file, you can use concurrent access. Otherwise you shouldn't.
When used with a descriptor (fd), read() and write() rely on the internal state of the fd to know the "current offset" at which the read and write will occur. As a result, they aren't thread-safe.
To allow a single descriptor to be used by multiple threads simultaneously, pread() and pwrite() are provided. With those interfaces, the descriptor and the desired offset are specified, so the "current offset" in the descriptor isn't used.