Prevent fork() from copying sockets - linux

I have the following situation (pseudocode):
function f:
    pid = fork()
    if pid == 0:
        exec to another long-running executable (no communication needed to that process)
    else:
        return "something"
f is exposed over an XmlRpc++ server. When the function is called over XML-RPC, the parent process prints "done closing socket" after the function has returned "something". However, the XML-RPC client hangs for as long as the child process is still running. When I kill the child process, the XML-RPC client finishes the RPC call correctly.
It seems to me that the problem is fork() copying socket descriptors to the child process (the parent called closesocket, but the child still owns a reference, so the connection stays established). How can I circumvent this?
EDIT: I read about FD_CLOEXEC already, but can't I force all descriptors to be closed on exec?

No, you can't force all file descriptors to be closed on exec. You will need to loop over all unwanted file descriptors in the child after the fork() and close them. Unfortunately, there isn't an easy, portable way to do that - the usual approach is to use getrlimit() to get the current value of RLIMIT_NOFILE and loop from 3 up to that number, trying close() on each candidate.
If you are happy to be Linux-only, you can read the /proc/self/fd/ directory to determine the open file descriptors and close them (except 0, 1 and 2 - which should either be left alone or reopened to /dev/null).
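A minimal sketch of that child-side cleanup (the helper name is mine; it prefers the Linux /proc/self/fd approach and falls back to the getrlimit() loop described above):

#include <dirent.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <unistd.h>

/* Call in the child, after fork() and before exec. */
static void close_inherited_fds(void)
{
    DIR *d = opendir("/proc/self/fd");
    if (d != NULL) {
        struct dirent *e;
        int skip = dirfd(d);                /* don't close the DIR's own descriptor */
        while ((e = readdir(d)) != NULL) {
            int fd = atoi(e->d_name);       /* "." and ".." become 0 and are skipped */
            if (fd > 2 && fd != skip)
                close(fd);
        }
        closedir(d);
    } else {
        /* Portable fallback: try every descriptor up to RLIMIT_NOFILE. */
        struct rlimit rl;
        getrlimit(RLIMIT_NOFILE, &rl);
        for (int fd = 3; fd < (int)rl.rlim_cur; fd++)
            close(fd);
    }
}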

Related

File descriptor for ioctl call to make a controlling terminal

On Linux, to control the lifetime of processes forked off of my main process, I'm making the main process the session and group leader by calling setsid(). Then it looks like I need to have the main process create a controlling terminal for the process group, and then, once the main process terminates, all other processes in the process group will receive a SIGHUP. I tried calling open() for a regular file on the filesystem, but ioctl() refuses to accept this fd with 'Inappropriate file descriptor'. Is posix_openpt() what I should be using instead? The man page says that it'll create a pseudo-terminal and return a file descriptor for it. Do I even need an ioctl(fd, TIOCSCTTY, 0) call after posix_openpt(), or is not using O_NOCTTY all I really need? Thanks!
Do I even need an ioctl(fd, TIOCSCTTY, 0) call after posix_openpt(), or is not using O_NOCTTY all I really need?
I just tried on Ubuntu 18.04.5:
If you don't do that and the controlling process terminates, the systemd process becomes the new controlling process of the child process and the child process does not receive SIGHUP.
I'm not sure if this behavior is the same for other Linux distributions, too.
Is posix_openpt() what I should be using instead?
Try the following code:
int master, tty;
master = posix_openpt(O_RDWR);          /* open the pseudo-terminal master */
grantpt(master);                        /* set up the slave's ownership and permissions */
unlockpt(master);                       /* allow the slave side to be opened */
tty = open(ptsname(master), O_RDWR);    /* open the slave side */
ioctl(tty, TIOCSCTTY, 0);               /* make it our controlling terminal */
This must be done in the same process that called setsid().
Note: As soon as you completely close the master file, the processes will receive a SIGHUP.
("Completely" means: When you close all copies created by dup() or by creating a child process inheriting the handle.)
If you really want to use the pseudo-TTY, you should not let child processes inherit the master handle (or you should close() the handle in the child process). However, in your case you only want to use the pseudo-TTY as a "workaround", so this is not that important.
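Putting the pieces together, here is a rough sketch of the whole arrangement (my own illustrative layout, not code from the answer): the session leader acquires the pseudo-terminal as its controlling terminal, forks a worker that closes its copy of the master, and the worker receives SIGHUP when the leader exits.

#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    if (fork() > 0)                 /* first fork: the original process exits so that */
        return 0;                   /* the survivor is not a process group leader */

    setsid();                       /* new session, no controlling terminal yet */

    int master = posix_openpt(O_RDWR);
    grantpt(master);
    unlockpt(master);
    int tty = open(ptsname(master), O_RDWR);
    ioctl(tty, TIOCSCTTY, 0);       /* acquire the pty as controlling terminal */

    if (fork() == 0) {              /* worker: same session and process group */
        close(master);              /* keep no copy of the master side */
        pause();                    /* terminated by SIGHUP when the leader exits */
        _exit(0);
    }

    sleep(1);
    return 0;                       /* leader exits here; the worker gets SIGHUP */
}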

How do different processes share file descriptors?

When forking a process, consider the following scenario:
1) We open two pipes for bidirectional IPC communication
2) Suppose these have (3,4) and (5,6) as file descriptors.
3) We fork the process somewhere in the middle.
4) We exec the child process
Now, what happens is that these two processes are completely independent of each other, and the then-child process now has its own address space and is a completely new process.
My question is: how do pipes (file descriptors) live on in an exec'd process? Because pipes opened like this are used for the exec'd process and the parent process to communicate.
The only way I can see this happening is if file descriptors were global to the machine, which I think is impossible, as that would cause conflicts.
And in IDLE, for this code:
import os
from multiprocessing import Process, Pipe

def sender(pipe):
    """
    send object to parent on anonymous pipe
    """
    pipe.send(['spam'] + [42, 'eggs'])
    pipe.close()

def talker(pipe):
    """
    send and receive objects on a pipe
    """
    pipe.send(dict(name = 'Bob', spam = 42))
    reply = pipe.recv()
    print('talker got: ', reply)

if __name__ == '__main__':
    (parentEnd, childEnd) = Pipe()
    Process(target = sender, args = (childEnd,)).start()
    print("parent got: ", parentEnd.recv())
    parentEnd.close()

    (parentEnd, childEnd) = Pipe()
    child = Process(target = talker, args = (childEnd,))
    ############################## from here
    child.start()
    print('From talker Parent got:', parentEnd.recv())
    parentEnd.send({x * 2 for x in 'spam'})
    child.join()
    ############################## to here
    print('parent exit')
Two processes run, but in IDLE only the output from one process can be seen, not both. In the terminal, however, it looks as if stdout is shared.
The actual copying of the process file descriptor table is regulated by the more generic clone() syscall flag CLONE_FILES (which is not set by fork()):
CLONE_FILES (since Linux 2.0)
...
If CLONE_FILES is not set, the child process inherits a copy of all file descriptors opened in the calling process at the time of clone(). (The duplicated file descriptors in the child refer to the same open file descriptions (see open(2)) as the corresponding file descriptors in the calling process.) Subsequent operations that open or close file descriptors, or change file descriptor flags, performed by either the calling process or the child process do not affect the other process.
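A small illustration of that "same open file description" detail (a sketch; it assumes /etc/hostname exists and skips error checking): a read() in the child advances the file offset that the parent then sees on its own descriptor.

#include <fcntl.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/etc/hostname", O_RDONLY);   /* any readable file works */
    char c;

    if (fork() == 0) {          /* the child's fd refers to the same open file description */
        read(fd, &c, 1);        /* advances the shared offset to 1 */
        _exit(0);
    }
    wait(NULL);
    /* The parent observes the offset moved by the child's read(). */
    printf("offset after child read: %ld\n", (long)lseek(fd, 0, SEEK_CUR));
    return 0;
}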
execve() doesn't touch file descriptors, except those opened or marked with the O_CLOEXEC or FD_CLOEXEC flag, which are closed:
By default, file descriptors remain open across an execve(). File descriptors that are marked close-on-exec are closed; see the description of FD_CLOEXEC in fcntl(2).
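So a pipe set up before the fork keeps working after the exec. A short sketch (the exec'd program, "ls" here, is only an example): the child wires the pipe's write end to its stdout and execs, and the parent reads the exec'd program's output through the pipe.

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    pipe(fd);                           /* fd[0] = read end, fd[1] = write end */

    if (fork() == 0) {
        close(fd[0]);                   /* the child only writes */
        dup2(fd[1], STDOUT_FILENO);     /* stdout now refers to the pipe */
        close(fd[1]);
        execlp("ls", "ls", (char *)NULL);   /* descriptors survive the exec */
        _exit(127);                     /* only reached if exec fails */
    }

    close(fd[1]);                       /* the parent only reads */
    char buf[256];
    ssize_t n;
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        write(STDOUT_FILENO, buf, n);   /* echo whatever the exec'd child printed */
    close(fd[0]);
    wait(NULL);
    return 0;
}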

Linux pipe example . ipc pipe creation

I was looking through the pipe(2) syscall example for Linux; I got this from TLDP: http://tldp.org/LDP/lpg/node11.html#SECTION00722000000000000000
When we need to close the input of the child process, we close fd[1] of the child - fine, but we should also close the output of the parent, i.e. close fd[0] of the parent. Why do we use an else statement here? In this case the parent's fd[0] will close only when the fork fails, am I correct?
I feel there should not be an else statement, and both the input of the child and the output of the parent should be closed for communication from child to parent, correct?
You shouldn't talk about child input and parent output, that looks like you are referring to stdin and stdout, which is not necessarily the same as the pipe's read and write channels.
For communication from child to parent, the child needs to close the pipe's read channel (fd[0] in your example), and the parent needs to close the pipe's write channel (fd[1]).
Your confusion seems to be more about forking than about pipes.
The else is needed, because we need to execute different code in the parent and in the child. It is very common to use if / else after forking to differentiate the code that executes in each process. Remember that fork(2) returns twice: in the parent, and in the newborn child. It returns the child's pid in the parent, and 0 in the child, so we use that to tell them apart.
In the example you posted, if fork(2) fails, the first if is entered and the process exits. Otherwise, a pair of if / else is used to execute different code in each process.
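A minimal sketch of that pattern for child-to-parent communication (the message text and error handling are only illustrative): the child closes the pipe's read channel fd[0] and writes, the parent closes the write channel fd[1] and reads, and the if / else on fork()'s return value selects which code runs in which process.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    char buf[64];

    if (pipe(fd) == -1) {
        perror("pipe");
        exit(1);
    }

    pid_t pid = fork();
    if (pid == -1) {                /* fork failed */
        perror("fork");
        exit(1);
    } else if (pid == 0) {          /* child: fork() returned 0 */
        close(fd[0]);               /* the child only writes */
        const char *msg = "hello from the child";
        write(fd[1], msg, strlen(msg) + 1);
        close(fd[1]);
        _exit(0);
    } else {                        /* parent: fork() returned the child's pid */
        close(fd[1]);               /* the parent only reads */
        read(fd[0], buf, sizeof buf);
        printf("parent read: %s\n", buf);
        close(fd[0]);
        wait(NULL);
    }
    return 0;
}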

Can I prevent a script from launching twice using open(2) with O_CREAT and flock(2)?

I would like to prevent a script from launching twice by using a PID file. There are many ways to implement exclusivity, but since my script will always run on a Linux machine and I would like to be able to detect stale PID files automatically, I would like to use flock(2) to implement this.
I was told long ago by a colleague that the following pseudocode is the right way to do this (open(..., 'w') means "open in write mode with O_CREAT"):
fd = open(lockfile, 'w');
write(fd, pid);
close(fd);
fd = open(lockfile);
flock(fd)
file_pid = read(fd)
if file_pid != pid:
    exit(1)
// do things
I am curious why he suggested the above instead of:
fd = open(lockfile, 'w')
flock(fd)
// do things
Presumably he suggested this because he thought the "create file if it doesn't exist" functionality of open(2) with O_CREAT is not atomic, that is, that two processes which call open(2) at exactly the same time might get handles to two different files because the file creation is not exclusive.
My question is, is the latter code always correct on a Linux system, or if not, when is it not correct?
flock is not 100% reliable: http://en.wikipedia.org/wiki/File_locking#Problems
The 1st recipe is rather intrusive, in the sense that a subsequent invocation of the process could blindly overwrite the pid data written by the previous invocation, effectively preventing the 1st process from running. At high repeated invocation rates it's thus possible for none of the processes to run.
To ensure file creation exclusivity use O_CREAT | O_EXCL. You'd need to handle untimely process death leaving the file behind, though.
I'd suggest 2 files:
a lock file opened with O_CREAT | O_EXCL, used just for protecting the actual PID file operations, should exist for just very short periods of time, easy to decide if stale based on creation time.
the actual PID file
Each process waits for the lock file to disappear (cleans it when it becomes stale), then attempts to create the lock file (only one instance succeeds, the others wait), checks the PID file existence/content (cleans up and deletes it if stale), creates a new PID file if it decides to run, then deletes the lock file and runs/exits as decided.
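For reference, a minimal sketch of the exclusive-creation step (the lock-file path and error handling are only illustrative): open(2) with O_CREAT | O_EXCL fails with EEXIST if the file already exists, so exactly one invocation wins the race.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const char *lockfile = "/tmp/myscript.lock";    /* illustrative path */

    int fd = open(lockfile, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd == -1) {
        if (errno == EEXIST)
            fprintf(stderr, "another instance appears to be running\n");
        else
            perror("open");
        exit(1);
    }

    /* We won the race: record our PID so a later run can detect staleness. */
    char buf[32];
    int len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
    write(fd, buf, len);
    close(fd);

    /* ... do things ... */

    unlink(lockfile);   /* remove the lock on a clean exit */
    return 0;
}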

Best practices for Linux socket programming

I have a simple C server that accepts connection to which I try connecting using telnet or netcat. Each time I receive a connection, I print out the descriptor and close the connection in a child process.
I run an instance of netcat, connect to the server, disconnect (Ctrl-C), and repeat this a few times. The values printed on the server side for the descriptors used are 4, 5, 6, 7 ... and they keep increasing.
I tried repeating this exercise after a period of time and the values still keep increasing. I'm concerned that my descriptors aren't closing (despite an explicit call to close).
Is there some signal I should be handling, setting the handler to close the connection?
After a fork the child process has a copied set of the parent's file descriptors. So the proper procedure is, after the fork, to (1) close the parent's listening socket in the child and to (2) close the new connection socket inherited by the child in the parent.
Open file descriptors are reference counted by the kernel. So when the child inherits the connection socket the reference count is 2. After the parent closes the connection socket the count remains at 1 until the child is done and closes it. The reference count having then dropped to 0 the connection is then closed. (Some details omitted.)
The upshot is that after making this change you will then see a lot of FDs equal to 4 in the parent, because the same FD number keeps being opened, closed and reused even though multiple connections are being processed by the children.
After fork(), both the parent and the child have a copy of the socket file descriptor; you should close the connection socket in the parent process after the fork.
That by itself does not close the connection; only when the child process closes its copy of the socket too is the connection actually closed.
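A sketch of that close-after-fork pattern in an accept loop (the port number and the lack of error handling are illustrative): the child closes the listening socket, the parent closes the per-connection socket, and each side keeps only the descriptor it actually uses.

#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(5000);               /* arbitrary example port */
    bind(listener, (struct sockaddr *)&addr, sizeof addr);
    listen(listener, 16);

    for (;;) {
        int conn = accept(listener, NULL, NULL);
        if (conn == -1)
            continue;
        if (fork() == 0) {                     /* child handles this connection */
            close(listener);                   /* the child never calls accept() */
            /* ... talk to the client on conn ... */
            close(conn);
            _exit(0);
        }
        close(conn);                           /* parent's copy; the connection stays open in the child */
        while (waitpid(-1, NULL, WNOHANG) > 0) /* reap finished children so they don't become zombies */
            ;
    }
}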
