Linux, cancel blocking read() - linux

In a multi-threaded Linux program used for serial communication, is it possible (and what would be the best approach) to terminate a blocking read() call from another thread?
I would like to keep everything as reactive as possible and avoid any use of timeouts with repeated polling.
The background of this question is that I'm trying to create a Scala serial communication library for Linux using JNI. I'm trying to keep the native side as simple as possible providing, amongst others, a read() and close() function. On the Scala side, one thread would call read() and block until data from the serial port is available. However, the serial port can be closed by other means, resulting in a call to close(). Now, to free up the blocked thread, I would somehow need to cancel the system read call.

One fairly popular trick: instead of blocking in read(), block in select() on both your serial port's file descriptor and a pipe. Then, when another thread wants to wake up your thread, it can do so by writing a byte to the other end of that pipe. That byte will cause select() to return, and your thread can now clean up and exit or do whatever else it needs to do. (Note that to make this work 100% reliably you'll probably want to set the serial descriptor to be non-blocking, to ensure that your thread only ever blocks in select() and never in read().)
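The pipe trick can be sketched roughly as follows. This is a minimal illustration in Python, whose os and select modules are thin wrappers over the same system calls. The names read_or_cancel and demo, and the use of an ordinary pipe as a stand-in for the serial port, are assumptions made for the example and not part of the original answer.

```python
import os
import select
import threading


def read_or_cancel(data_fd, wake_fd, max_bytes=4096):
    """Block in select() until data_fd has data or wake_fd is written to.

    Returns the bytes read, or None if the wait was cancelled.
    """
    readable, _, _ = select.select([data_fd, wake_fd], [], [])
    if wake_fd in readable:
        os.read(wake_fd, 1)  # consume the wake-up byte
        return None
    try:
        return os.read(data_fd, max_bytes)
    except BlockingIOError:
        return b""  # readiness was spurious; the caller can simply retry


def demo():
    data_r, data_w = os.pipe()      # stand-in for the serial port (assumption)
    wake_r, wake_w = os.pipe()      # the wake-up pipe
    os.set_blocking(data_r, False)  # block only in select(), never in read()

    results = []
    reader = threading.Thread(
        target=lambda: results.append(read_or_cancel(data_r, wake_r)))
    reader.start()
    os.write(wake_w, b"x")          # what the close() path would do
    reader.join()

    os.write(data_w, b"hello")      # normal case: data arrives
    results.append(read_or_cancel(data_r, wake_r))

    for fd in (data_r, data_w, wake_r, wake_w):
        os.close(fd)
    return results


if __name__ == "__main__":
    print(demo())
```

In a JNI library the same structure would live on the native side: close() writes to the pipe, and the native read() returns a "cancelled" result to the Scala caller.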

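AFAIK, signals are the only way to break a thread out of a system call that is already blocking.
Use pthread_kill() to send SIGUSR1 to that thread. Install the handler without SA_RESTART, so that read() fails with EINTR instead of being restarted.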

You could probably do fake data input:
for (const char *p = "please unblock!"; *p; p++) ioctl(fd, TIOCSTI, p); /* TIOCSTI injects one byte per call */
Before calling it, set some global flag so that, after 'read(...)' returns, you can check whether the received data is just wake-up goo or something more important.
Source: https://www.systutorials.com/docs/linux/man/4-tty_ioctl/

Related

How can I block a single thread for 3 different events (semaphore, pthread condition, and blocking socket recv)?

I have a multi-threaded system in which a main thread has to wait in blocking state for one of the following 4 events to happen:
inter-process semaphore (sem_wait())
pthread condition (pthread_cond_wait())
recv() from socket
timeout expiring
Ideally I'd like a mechanism to unblock the main thread when any of the above occurs, something like a ppoll() with suitable timeout parameter. Non-blocking and polling is out of the picture due to the impact on the CPU usage, while having separate threads blocking on different events is not ideal due to the increased latency (one thread unblocking from one of the events should eventually wake up the main one).
The code will be almost exclusively compiled under Linux with gcc toolchain, if that helps, but some portability would be good, if at all possible.
Thanks in advance for any suggestion
The mechanisms for waiting on multiple types of objects on Unix-like systems are not that great. In general, the idea is to, wherever possible, use file descriptors for IPC rather than multiple different IPC mechanisms.
From your comment, it sounds like you can edit or change the condition variable, but not the code that signals the semaphore. So what I'd recommend is something like the following.
Change the condition variable to either a pipe (for more portability) or an eventfd(2) object (Linux-specific). The notifying thread writes to the pipe whenever it wants to signal the main thread. This will allow you to select(2) or poll(2) or whatever in the main thread on both that pipe and the socket.
Because you're stuck with the semaphore, I think the best option would be to create another thread, whose sole purpose is to wait for the semaphore using sem_wait(), and then write to another pipe or eventfd(2) object when it is notified by whatever process is doing sem_post(). In the main thread, just add this other file descriptor to your select(2) set.
So you'll have three descriptors: one for the socket, one taking the place of the condition variable, and one which is written to when the semaphore is incremented. You can then wait on all three using your favorite I/O multiplexing method, and include directly whatever timeout you'd like.
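A rough sketch of that three-descriptor layout is shown below, in Python for brevity. Several things here are assumptions for illustration: the names wait_for_event and forward_semaphore, the use of plain pipes (the portable option; eventfd would be the Linux-specific one), and threading.Semaphore standing in for the inter-process semaphore that the real helper thread would block on with sem_wait().

```python
import os
import select
import socket
import threading


def wait_for_event(sources, timeout_ms):
    """Wait on several descriptors at once.

    sources maps a name to a file descriptor. Returns the sorted names of
    the descriptors that are readable, or [] if the timeout expired.
    """
    poller = select.poll()
    names = {}
    for name, fd in sources.items():
        poller.register(fd, select.POLLIN)
        names[fd] = name
    return sorted(names[fd] for fd, _ in poller.poll(timeout_ms))


def forward_semaphore(sem, notify_fd):
    """Helper thread: turn one semaphore post into a byte on a pipe."""
    sem.acquire()                 # stands in for sem_wait()
    os.write(notify_fd, b"\x01")


def demo():
    sock, peer = socket.socketpair()   # the socket the main thread reads
    cond_r, cond_w = os.pipe()         # replaces the condition variable
    sem_r, sem_w = os.pipe()           # written when the semaphore is posted
    sem = threading.Semaphore(0)
    helper = threading.Thread(target=forward_semaphore, args=(sem, sem_w))
    helper.start()

    sources = {"socket": sock.fileno(), "condition": cond_r,
               "semaphore": sem_r}
    first = wait_for_event(sources, 50)     # nothing has happened yet
    sem.release()                           # stands in for sem_post()
    helper.join()
    second = wait_for_event(sources, 1000)

    sock.close()
    peer.close()
    for fd in (cond_r, cond_w, sem_r, sem_w):
        os.close(fd)
    return first, second


if __name__ == "__main__":
    print(demo())
```

The extra thread adds one hop of latency for the semaphore only; the socket and the former condition variable wake the main thread directly.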

Aborting a Read() call from another goroutine

I'm working on an IMAP server, and one of the operations is to upgrade the connection to use TLS (via the STARTTLS command). Our current architecture has one goroutine reading data from the socket, parsing the commands, and then sending logical commands over a channel. Another goroutine reads from that channel and executes the commands. This works great in general.
When executing STARTTLS, though, we need to stop the current in-progress Read() call, otherwise that Read() will consume bytes from the TLS handshake. We can insert another class in between, but then that class will be blocked on the Read() call and we have the same problem. If the network connection were a channel, we could add another signal channel and use a select{} block to stop reading, but network connections aren't channels (and simply wrapping it in a goroutine and channel just moves the problem to that goroutine).
Is there any way to stop a Read() call once it's begun, without waiting for a timeout to expire or something similar?
The Read() call relies on your operating system's behaviour under the hood, and that in turn relies on socket behaviour.
If you're familiar with the socket interface (which is almost a standard between operating systems, with some small differences), you'll know that in synchronous (blocking) mode a read system call blocks the calling thread until data arrives or a timeout expires, and you can't change this behaviour.
Go presents synchronous I/O to each goroutine (the runtime multiplexes it over non-blocking descriptors internally), because goroutines make an asynchronous API unnecessary by design.
There's also a way to break out of a read: shutting down the socket manually. That is not the best design decision for one's code, nor for your specific case. So I think you'd be better off using smaller timeouts, or redesigning your code to work in some other way.
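At the socket level, the "shut it down" approach looks roughly like the sketch below. It is written in Python rather than Go and uses a socketpair as a stand-in connection, so it illustrates the operating-system behaviour on Linux, not Go's net package. The function names are made up for the example.

```python
import socket
import threading
import time


def blocking_reader(sock, results):
    """Block in recv() and record what it finally returns."""
    results.append(sock.recv(4096))


def demo():
    ours, peer = socket.socketpair()   # stand-in for a network connection
    results = []
    reader = threading.Thread(target=blocking_reader, args=(ours, results),
                              daemon=True)
    reader.start()
    time.sleep(0.1)                    # give the reader time to block
    ours.shutdown(socket.SHUT_RDWR)    # wakes the blocked recv()
    reader.join(timeout=5)
    stopped = not reader.is_alive()
    ours.close()
    peer.close()
    return stopped, results


if __name__ == "__main__":
    print(demo())
```

The woken recv() reports end-of-stream, and the connection can no longer be used afterwards, which is why this does not fit the STARTTLS case where the same connection must stay open.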
There's really not much you can do to stop a Read call, unless you SetReadDeadline, or Close the connection.
One thing you could do is buffer it with the bufio package. This will allow you to Peek without actually reading something off the buffer. Peek will block just like Read does, but will allow you to decide what to do when something is available to read.

should socket be set NON-BLOCKING before it is polled by select()?

I recall that when we want to use select() on a socket descriptor, the socket should be set to NON-BLOCKING in advance.
But today I read a source file that does not seem to contain any lines setting the socket to NON-BLOCKING.
Is my memory correct or not?
thanks!
duskwuff has the right idea when he says: "In general, you do not need to set a socket as non-blocking to use it in select()."
This is true if your kernel is POSIX compliant with regard to select(). Unfortunately, some people use Linux, which is not, as the Linux select() man page says:
Under Linux, select() may report a socket file descriptor as "ready for
reading", while nevertheless a subsequent read blocks. This could for
example happen when data has arrived but upon examination has wrong
checksum and is discarded. There may be other circumstances in which a
file descriptor is spuriously reported as ready. Thus it may be safer
to use O_NONBLOCK on sockets that should not block.
There was a discussion of this on lkml on or about Sat, 18 Jun 2011. One kernel hacker tried to justify the non POSIX compliance. They honor POSIX when it's convenient and desecrate it when it's not.
He argued "there may be two readers and the second will block." But such an application flaw is a non sequitur. The kernel is not expected to prevent application flaws. The kernel has a clear duty: in all cases of the first read() after select(), the kernel must return at least 1 byte, EOF, or an error; but NEVER block. As for write(), you should always test whether the socket is reported writable by select() before writing. This guarantees you can write at least one byte, or get an error; but NEVER block. Let select() help you; don't write blindly hoping you won't block. The Linux hacker's grumbling about corner cases, etc., is a euphemism for "we're too lazy to work on hard problems."
Suppose you read a serial port set for:
min N; with -icanon, set N characters minimum for a completed read
time N; with -icanon, set read timeout of N tenths of a second
min 250 time 1
Here you want blocks of 250 characters, or a one tenth second timeout. When I tried this on Linux in non blocking mode, the read returned for every single character, hammering the CPU. It was NECESSARY to leave it in blocking mode to get the documented behavior.
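For reference, the "min 250 time 1" setting corresponds to the VMIN and VTIME entries of the termios control-character array. The sketch below applies it from Python's termios module to a pseudo-terminal, which is used here only as a stand-in for a real serial port. The helper name set_min_time is an assumption, and the example only sets and reads back the values; it does not reproduce the read behaviour described above.

```python
import os
import termios


def set_min_time(fd, vmin, vtime):
    """Put a tty into non-canonical mode with the given MIN and TIME."""
    attrs = termios.tcgetattr(fd)
    # attrs is [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
    attrs[3] &= ~termios.ICANON       # -icanon
    attrs[6][termios.VMIN] = vmin     # min N
    attrs[6][termios.VTIME] = vtime   # time N, in tenths of a second
    termios.tcsetattr(fd, termios.TCSANOW, attrs)


def demo():
    master, slave = os.openpty()      # pseudo-terminal instead of a real port
    try:
        set_min_time(slave, 250, 1)
        attrs = termios.tcgetattr(slave)
        return (bool(attrs[3] & termios.ICANON),
                attrs[6][termios.VMIN],
                attrs[6][termios.VTIME])
    finally:
        os.close(master)
        os.close(slave)


if __name__ == "__main__":
    print(demo())
```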
So there are good reasons to use blocking mode with select() and expect your kernel to be POSIX compliant.
But if you must use Linux, Jeremy's advice may help you cope with some of its kernel flaws.
It depends. Setting a socket as non-blocking does several things:
Makes read() / recv() return immediately with no data, instead of blocking, if there is nothing available to read on the socket.
If you are using select(), this is probably a non-issue. So long as you only read from a socket when select() tells you it is readable, you're fine.
Makes write() / send() return partial (or zero) writes, instead of blocking, if not enough space is available in kernel buffers.
This one is tricky. If your application is written to handle this situation, it's great, because it means your application will not block when a client is reading slowly. However, it means that your application will need to temporarily store writable data in its own application-level buffers, rather than writing directly to sockets, and selectively place sockets with pending writes in the writefds set. Depending on what your application is, this may either be a lifesaver or a huge added complication. Choose carefully.
If set before the socket is connected, makes connect() return immediately, before a connection is actually made.
Similarly, this is sometimes useful if your application needs to make connections to hosts that may respond slowly while continuing to respond on other sockets, but can cause issues if you aren't careful about how you handle these half-connected sockets. It's usually best avoided (by only setting sockets as non-blocking after they are connected, if at all).
In general, you do not need to set a socket as non-blocking to use it in select(). The system call already lets you handle sockets in a basic non-blocking fashion. Some applications will need non-blocking writes, though, and that's what the flag is still needed for.
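The application-level buffering described for non-blocking writes might look something like this minimal Python sketch. The class name BufferedSender and its methods are hypothetical, and a socketpair driven by a single loop stands in for a real client that reads slowly.

```python
import select
import socket


class BufferedSender:
    """Holds data the kernel would not accept yet (hypothetical helper)."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.pending = bytearray()

    def queue(self, data):
        self.pending += data

    def wants_write(self):
        # Only sockets with pending data belong in the writefds set.
        return bool(self.pending)

    def flush_some(self):
        """Call when select() reports the socket writable."""
        try:
            sent = self.sock.send(self.pending)   # may be a partial write
        except BlockingIOError:
            return 0                              # kernel buffer still full
        del self.pending[:sent]
        return sent


def demo(total=1_000_000):
    a, b = socket.socketpair()
    sender = BufferedSender(a)
    sender.queue(b"x" * total)
    received = 0
    while received < total:
        wlist = [a] if sender.wants_write() else []
        readable, writable, _ = select.select([b], wlist, [], 1.0)
        if not readable and not writable:
            break                                 # nothing happened in time
        if writable:
            sender.flush_some()
        if readable:
            received += len(b.recv(65536))
    a.close()
    b.close()
    return received, len(sender.pending)


if __name__ == "__main__":
    print(demo())
```

The loop never blocks in send(): whatever the kernel refuses stays in pending, and the socket is put in the write set only while something is pending.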
send() and write() block if you provide more data than can be fitted into the socket send buffer. Normally in select() programming you don't want to block anywhere except in select(), so you use non-blocking mode.
With certain Windows APIs it is indeed essential to use non-blocking mode.
Usually when you are using select(), you are using it as the basis of an event loop; and when using an event loop you want the event loop to block only inside select() and never anywhere else. (The reason for that is so that your program will always wake up whenever there is something to do on any of the sockets it is handling -- if, for example, your program was blocked inside recv() for socket A, it would be unable to handle any data coming in on socket B until it got some data from socket A first to wake it up; and vice versa).
Therefore it is best to set all sockets non-blocking when using select(). That way there is no chance of your program getting blocked on a single socket and ignoring the other ones for an extended period of time.
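A minimal Python sketch of that pattern: the socket is non-blocking, the only blocking call is select(), and a read that would have blocked despite a "readable" report is treated as "nothing yet". The name read_ready and the socketpair used for the demo are assumptions for illustration.

```python
import select
import socket


def read_ready(sock, timeout, max_bytes=4096):
    """Wait in select() for up to timeout seconds, then try one read.

    Returns the bytes read (b"" means end of stream), or None if there
    was nothing to read.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    try:
        return sock.recv(max_bytes)
    except BlockingIOError:
        return None   # spurious readiness; just go back to select()


def demo():
    a, b = socket.socketpair()
    a.setblocking(False)
    outcomes = []
    try:
        a.recv(4096)                  # no data: fails instead of blocking
    except BlockingIOError:
        outcomes.append("would block")
    outcomes.append(read_ready(a, 0.05))
    b.sendall(b"ping")
    outcomes.append(read_ready(a, 1.0))
    a.close()
    b.close()
    return outcomes


if __name__ == "__main__":
    print(demo())
```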

How to interrupt a thread performing a blocking socket connect?

I have some code that spawns a pthread that attempts to maintain a socket connection to a remote host. If the connection is ever lost, it attempts to reconnect using a blocking connect() call on its socket. Since the code runs in a separate thread, I don't really care about the fact that it uses the synchronous socket API.
That is, until it comes time for my application to exit. I would like to perform some semblance of an orderly shutdown, so I use thread synchronization primitives to wake up the thread and signal for it to exit, then perform a pthread_join() on the thread to wait for it to complete. This works great, unless the thread is in the middle of a connect() call when I command the shutdown. In that case, I have to wait for the connect to time out, which could be a long time. This makes the application appear to take a long time to shut down.
What I would like to do is to interrupt the call to connect() in some way. After the call returns, the thread will notice my exit signal and shut down cleanly. Since connect() is a system call, I thought that I might be able to intentionally interrupt it using a signal (thus making the call return EINTR), but I'm not sure if this is a robust method in a POSIX threads environment.
Does anyone have any recommendations on how to do this, either using signals or via some other method? As a note, the connect() call is down in some library code that I cannot modify, so changing to a non-blocking socket is not an option.
Try to close() the socket to interrupt the connect(). I'm not sure, but I think it will work at least on Linux. Of course, be careful to synchronize properly such that you only ever close() this socket once, or a second close() could theoretically close an unrelated file descriptor that was just opened.
EDIT: shutdown() might be more appropriate because it does not actually close the socket.
Alternatively, you might want to take a look at pthread_cancel() and pthread_kill(). However, I don't see a way to use these two without a race condition.
I advise that you abandon the multithreaded-server approach and instead go event-driven, for example by using epoll for event notification. This way you can avoid all these very basic problems that become very hard with threads, like proper shutdown. You are free to at any time do anything you want, e.g. safely close sockets and never hear from them again.
On the other hand, if in your worker thread you do a non-blocking connect() and get notified via epoll_pwait() (or ppoll() or pselect(); note the p), you may be able to avoid race conditions associated with signals.
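As a rough illustration of a cancellable non-blocking connect(), the Python sketch below waits in select() on the connecting socket and on a wake-up pipe. Note the substitutions: it uses select() with a pipe rather than epoll_pwait() or ppoll() with signals, the function name connect_or_cancel is made up, and it assumes the connecting code can be changed, which the question says is not the case for the library in use.

```python
import errno
import os
import select
import socket


def connect_or_cancel(address, wake_fd, timeout=None):
    """Start a non-blocking connect and wait for it or for a wake-up.

    Returns the connected socket, or None if cancelled or timed out.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex(address)
    if err not in (0, errno.EINPROGRESS):
        sock.close()
        raise OSError(err, os.strerror(err))
    # A connecting socket becomes writable once the attempt has finished.
    readable, writable, _ = select.select([wake_fd], [sock], [], timeout)
    if wake_fd in readable or not writable:
        sock.close()
        return None
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # local listener, for the demo only
    listener.listen(5)
    address = listener.getsockname()
    wake_r, wake_w = os.pipe()

    sock = connect_or_cancel(address, wake_r, timeout=5)
    connected = sock is not None and sock.getpeername() == address
    if sock is not None:
        sock.close()

    os.write(wake_w, b"x")            # shutdown requested by another thread
    cancelled = connect_or_cancel(address, wake_r, timeout=5) is None

    listener.close()
    os.close(wake_r)
    os.close(wake_w)
    return connected, cancelled


if __name__ == "__main__":
    print(demo())
```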

Server running in linux kernel. Should listen happen in a thread or not?

I am writing a client/server in the Linux kernel (Yes, inside the kernel. It's a design decision that has been taken and finalised. It's not going to change.)
The server reads incoming packets from a raw socket. The transport protocol for these packets (on which the raw socket is listening) is custom and UDP like. In short I do not have to listen for incoming connections and then fork a thread to handle that connection.
I have to just process any IP datagram coming on that raw socket. I will keep reading for packets in an infinite loop on the raw socket. In the user-level equivalent program, I would have created a separate thread and kept listening for incoming packets.
Now for kernel level server, I have doubts about whether I should run it in a separate thread or not because:
1. I think read() is an I/O operation. So somewhere inside the read(), kernel must be calling schedule() function to relinquish the control of the processor. Thus after calling read() on raw socket, the current kernel active context will be put on hold (put in a sleep queue maybe?) until the packets are available. As and when packets will arrive, the kernel interrupt context will signal that the read context, which is sleeping in the queue, is once again ready to run. I am using 'context' here on purpose instead of 'thread'. Thus I should not require a separate kernel thread.
2. On the other hand, if read() does not relinquish the control then entire kernel will be blocked.
Can anyone provide tips about how should I design my server?
What is the fallacy of the argument presented in point 1?
I'm not sure whether you need a raw socket at all in the kernel. Inside the kernel you can add a netfilter hook, or register something else (???) which will receive all packets; this might be what you want.
If you DID use a raw socket inside the kernel, then you'd probably need to have a kernel thread (i.e. started by kernel_thread) to call read() on it. But it need not be a kernel thread, it could be a userspace thread which just made a special syscall or device call to call the desired kernel-mode routine.
If you have a hook registered, the context it's called in is probably something which should not do too much processing; I don't know exactly what that is likely to be, it may be a "bottom half handler" or "tasklet", whatever they are (these types of control structures keep changing from one version to another). I hope it's not actually an interrupt service routine.
In answer to your original question:
Yes, sys_read will block the calling thread, whether it's a kernel thread or a userspace one. The system will not hang. However, if the calling thread is not in a state where blocking makes sense, the kernel will panic (scheduling in interrupt or something)
Yes you will need to do this in a separate thread, no it won't hang the system. However, making system calls in kernel mode is very iffy, although it does work (sort of).
But if you installed some kind of hook instead, you wouldn't need to do any of that.
I think your best bet might be to emulate the way drivers are written, think of your server as a virtual device sitting on top of the ones that the requests are coming from. Example: a mouse driver accepts continuous input, but doesn't lock the system if programmed correctly, and a network adapter is probably more similar to your case.

Resources