I have some code that spawns a pthread that attempts to maintain a socket connection to a remote host. If the connection is ever lost, it attempts to reconnect using a blocking connect() call on its socket. Since the code runs in a separate thread, I don't really care about the fact that it uses the synchronous socket API.
That is, until it comes time for my application to exit. I would like to perform some semblance of an orderly shutdown, so I use thread synchronization primitives to wake up the thread and signal for it to exit, then perform a pthread_join() on the thread to wait for it to complete. This works great, unless the thread is in the middle of a connect() call when I command the shutdown. In that case, I have to wait for the connect to time out, which could be a long time. This makes the application appear to take a long time to shut down.
What I would like to do is to interrupt the call to connect() in some way. After the call returns, the thread will notice my exit signal and shut down cleanly. Since connect() is a system call, I thought that I might be able to intentionally interrupt it using a signal (thus making the call return EINTR), but I'm not sure if this is a robust method in a POSIX threads environment.
Does anyone have any recommendations on how to do this, either using signals or via some other method? As a note, the connect() call is down in some library code that I cannot modify, so changing to a non-blocking socket is not an option.
Try to close() the socket to interrupt the connect(). I'm not sure, but I think it will work at least on Linux. Of course, be careful to synchronize properly such that you only ever close() this socket once, or a second close() could theoretically close an unrelated file descriptor that was just opened.
EDIT: shutdown() might be more appropriate because it does not actually close the socket.
Alternatively, you might want to take a look at pthread_cancel() and pthread_kill(). However, I don't see a way to use these two without a race condition.
I advise that you abandon the multithreaded-server approach and instead go event-driven, for example by using epoll for event notification. This way you can avoid all these very basic problems that become very hard with threads, like proper shutdown. You are free to at any time do anything you want, e.g. safely close sockets and never hear from them again.
On the other hand, if in your worker thread you do a non-blocking connect() and get notified via epoll_pwait() (or ppoll() or pselect(); note the p), you may be able to avoid race conditions associated with signals.
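As a rough sketch of the non-blocking approach, assuming the connect code can be changed (the question says it cannot, so this applies only where it can): the example below swaps the signal-based wake-up for a plain poll() on the socket plus the read end of a pipe, so no signal masks are needed. The names connect_cancellable, demo_local_connect and wake_fd are made up for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: 0 = connected, -1 = error (errno set),
 * -2 = cancelled through wake_fd or timed out.
 * The socket is left in non-blocking mode. */
int connect_cancellable(int sock, const struct sockaddr *addr, socklen_t len,
                        int wake_fd, int timeout_ms)
{
    assert(addr != NULL);
    int flags = fcntl(sock, F_GETFL, 0);
    if (flags == -1 || fcntl(sock, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    if (connect(sock, addr, len) == 0)
        return 0;                      /* completed immediately */
    if (errno != EINPROGRESS)
        return -1;

    struct pollfd fds[2] = {
        { .fd = sock,    .events = POLLOUT },  /* writable: connect finished */
        { .fd = wake_fd, .events = POLLIN  },  /* readable: shutdown requested */
    };
    int n;
    do {
        n = poll(fds, 2, timeout_ms);  /* note: restarts the full timeout on EINTR */
    } while (n == -1 && errno == EINTR);
    if (n == -1)
        return -1;
    if (n == 0 || (fds[1].revents & POLLIN))
        return -2;

    /* The socket became writable: fetch the real outcome of the connect. */
    int err = 0;
    socklen_t elen = sizeof err;
    if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &elen) == -1)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}

/* Demo: connect to a throwaway listener on the loopback interface. */
int demo_local_connect(void)
{
    struct sockaddr_in sa = { .sin_family = AF_INET };
    socklen_t slen = sizeof sa;
    int wake[2];

    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel picks one */
    int lst = socket(AF_INET, SOCK_STREAM, 0);
    int cli = socket(AF_INET, SOCK_STREAM, 0);
    if (lst == -1 || cli == -1 || pipe(wake) == -1)
        return -1;
    if (bind(lst, (struct sockaddr *)&sa, sizeof sa) == -1 ||
        listen(lst, 1) == -1 ||
        getsockname(lst, (struct sockaddr *)&sa, &slen) == -1)
        return -1;

    int rc = connect_cancellable(cli, (struct sockaddr *)&sa, sizeof sa,
                                 wake[0], 2000);
    close(cli); close(lst); close(wake[0]); close(wake[1]);
    return rc;
}
```

To cancel a pending connect, the shutdown code would write one byte to the write end of the pipe; the helper then returns -2 and the thread can check its exit flag.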
I'm doing select() on a blocking socket with no timeout select(sock+1, &rfd, NULL, NULL, NULL).
This happens in a thread whose objective is to dispatch incoming data. Another surveillance thread is managing a keep alive with the peer and when it detects a dead connection, it would close the socket.
I was expecting select() to return with -1 in that case. It does that on Windows but never on Linux, so the dispatch thread is blocked forever when the peer disappears non-gracefully. For completeness, there is pending data to be transmitted on that socket; I've tried to play with SO_LINGER, but that does not change anything.
The problem can be solved by setting a timeout in select() and in that case after close and timeout, select() ultimately exits with -1, but I thought, reading the doc, that select() with no timeout would still exit on close, even when the peer is not responding.
Am I misusing select(), or is there a better way to handle half-open sockets?
Yes, you are misusing select(). The select(2) man page states:
If a file descriptor being monitored by select() is closed in another thread, the result is unspecified. On some UNIX systems, select() unblocks and returns, with an indication that the file descriptor is ready (a subsequent I/O operation will likely fail with an error, unless the file descriptor is reopened between the time select() returned and the I/O operation is performed). On Linux (and some other systems), closing the file descriptor in another thread has no effect on select(). In summary, any application that relies on a particular behavior in this scenario must be considered buggy.
So you cannot close the connection from another thread. Unfortunately, poll() has the same issue.
EDIT
There are several possible solutions, and I do not have sufficient information about your application. The following changes can be considered:
Use epoll instead of select if you are on Linux, or another modern polling mechanism if you are on another OS. select is quite an old function, designed at a time when threading was not considered seriously.
Establish a communication channel between the select thread and the keep-alive thread. When the keep-alive thread detects a dead peer, it doesn't close the socket itself but instructs the select thread to do so. Typically this can be done through a local socket: the local socket is added to the select descriptor set, and when the keep-alive thread writes something to it, the select thread wakes up and can take action.
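A minimal sketch of both suggestions combined, assuming Linux: the dispatch thread waits in epoll on the data socket and on a control descriptor, and it is the only thread that ever closes the socket. An eventfd stands in for the local socket here (an assumption; a socketpair() would work the same way), and the names struct dispatcher, dispatch_thread and demo_wake_dispatcher are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/socket.h>
#include <unistd.h>

struct dispatcher {
    int data_fd;               /* socket being monitored */
    int wake_fd;               /* eventfd written by the keep-alive thread */
    int closed_by_dispatcher;  /* set when the wake-up path closed data_fd */
};

/* Dispatch thread: waits on both descriptors; sole closer of data_fd. */
static void *dispatch_thread(void *arg)
{
    struct dispatcher *d = arg;
    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN };

    if (ep == -1)
        return NULL;
    ev.data.fd = d->data_fd;
    epoll_ctl(ep, EPOLL_CTL_ADD, d->data_fd, &ev);
    ev.data.fd = d->wake_fd;
    epoll_ctl(ep, EPOLL_CTL_ADD, d->wake_fd, &ev);

    for (;;) {
        struct epoll_event out;
        int n = epoll_wait(ep, &out, 1, -1);   /* no timeout needed */
        if (n == -1 && errno == EINTR)
            continue;
        if (n <= 0)
            break;
        if (out.data.fd == d->wake_fd) {       /* peer was declared dead */
            close(d->data_fd);
            d->closed_by_dispatcher = 1;
            break;
        }
        char buf[256];                          /* normal data path */
        if (read(d->data_fd, buf, sizeof buf) <= 0) {
            close(d->data_fd);                  /* EOF or error */
            break;
        }
    }
    close(ep);
    return NULL;
}

/* Demo: the caller plays the keep-alive thread. Returns 1 if the
 * dispatcher closed the socket in response to the wake-up. */
int demo_wake_dispatcher(void)
{
    int sv[2];
    struct dispatcher d = { .closed_by_dispatcher = 0 };
    pthread_t t;
    uint64_t one = 1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    d.data_fd = sv[0];
    d.wake_fd = eventfd(0, 0);
    if (d.wake_fd == -1)
        return -1;
    int rc = pthread_create(&t, NULL, dispatch_thread, &d);
    assert(rc == 0);

    /* Report the dead peer instead of calling close() from this thread. */
    if (write(d.wake_fd, &one, sizeof one) != (ssize_t)sizeof one)
        return -1;
    pthread_join(t, NULL);
    close(sv[1]);
    close(d.wake_fd);
    return d.closed_by_dispatcher;
}
```

Because only one thread closes the descriptor, the unspecified close-while-waiting case described in the man page never arises.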
I'm working on an IMAP server, and one of the operations is to upgrade the connection to use TLS (via the STARTTLS command). Our current architecture has one goroutine reading data from the socket, parsing the commands, and then sending logical commands over a channel. Another goroutine reads from that channel and executes the commands. This works great in general.
When executing STARTTLS, though, we need to stop the current in-progress Read() call, otherwise that Read() will consume bytes from the TLS handshake. We can insert another class in between, but then that class will be blocked on the Read() call and we have the same problem. If the network connection were a channel, we could add another signal channel and use a select{} block to stop reading, but network connections aren't channels (and simply wrapping it in a goroutine and channel just moves the problem to that goroutine).
Is there any way to stop a Read() call once it's begun, without waiting for a timeout to expire or something similar?
The Read() call relies on your operating system's behaviour under the hood, which in turn relies on socket behaviour.
If you're familiar with the socket interface (which is almost a standard across operating systems, with some small differences), you'll see that in synchronous (blocking) mode the read system call blocks the thread's execution until data arrives, an error occurs, or the timeout expires, and you can't change this behaviour.
Go exposes a synchronous I/O API for all its needs because goroutines make asynchronous code unnecessary by design; under the hood the runtime uses non-blocking sockets and a network poller, but a goroutine in Read() is still parked until the read completes.
There is also a way to break the read: shutting down the socket manually, which is not the best design decision for one's code, particularly in your specific case. I think you would be better off using smaller timeouts or redesigning your code to work in some other way.
There's really not much you can do to stop a Read call, unless you SetReadDeadline, or Close the connection.
One thing you could do is buffer it with the bufio package. This will allow you to Peek without actually reading something off the buffer. Peek will block just like Read does, but will allow you to decide what to do when something is available to read.
In a multi-threaded Linux program used for serial communication, is it possible (and what would be the best approach) to terminate a blocking read() call from another thread?
I would like to keep everything as reactive as possible and avoid any use of timeouts with repeated polling.
The background of this question is that I'm trying to create a Scala serial communication library for Linux using JNI. I'm trying to keep the native side as simple as possible providing, amongst others, a read() and close() function. On the Scala side, one thread would call read() and block until data from the serial port is available. However, the serial port can be closed by other means, resulting in a call to close(). Now, to free up the blocked thread, I would somehow need to cancel the system read call.
One fairly popular trick: instead of blocking in read(), block in select() on both your serial-socket and a pipe. Then when another thread wants to wake up your thread, it can do so by writing a byte to the other end of that pipe. That byte will cause select() to return, and your thread can now clean up and exit or do whatever it needs to do. (Note that to make this work 100% reliably you'll probably want to set your serial-socket to be non-blocking, to ensure that your thread only blocks in select() and never in read().)
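The pipe trick might look roughly like this. It is a sketch only: an ordinary pipe stands in for the serial port, and the names struct reader, reader_thread and demo_wake_reader are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

struct reader {
    int data_fd;      /* stands in for the serial port; must be O_NONBLOCK */
    int wake_rd;      /* read end of the wake-up pipe */
    long bytes_seen;  /* total payload bytes consumed */
};

static void *reader_thread(void *arg)
{
    struct reader *r = arg;
    int maxfd = r->data_fd > r->wake_rd ? r->data_fd : r->wake_rd;

    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(r->data_fd, &rfds);
        FD_SET(r->wake_rd, &rfds);
        /* The only place this thread ever blocks. */
        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;
            break;
        }
        if (FD_ISSET(r->data_fd, &rfds)) {
            char buf[64];
            ssize_t n = read(r->data_fd, buf, sizeof buf);  /* never blocks */
            if (n > 0)
                r->bytes_seen += n;
            else if (n == 0 || errno != EAGAIN)
                break;                      /* EOF or hard error */
        }
        if (FD_ISSET(r->wake_rd, &rfds))
            break;                          /* another thread asked us to stop */
    }
    return NULL;
}

/* Demo: send five bytes of data, then wake the reader. Returns bytes read. */
int demo_wake_reader(void)
{
    int data[2], wake[2];
    pthread_t t;

    if (pipe(data) == -1 || pipe(wake) == -1)
        return -1;
    if (fcntl(data[0], F_SETFL, fcntl(data[0], F_GETFL, 0) | O_NONBLOCK) == -1)
        return -1;

    struct reader r = { .data_fd = data[0], .wake_rd = wake[0], .bytes_seen = 0 };
    int rc = pthread_create(&t, NULL, reader_thread, &r);
    assert(rc == 0);

    if (write(data[1], "hello", 5) != 5)
        return -1;
    if (write(wake[1], "x", 1) != 1)        /* the wake-up byte */
        return -1;
    pthread_join(t, NULL);

    close(data[0]); close(data[1]); close(wake[0]); close(wake[1]);
    return (int)r.bytes_seen;
}
```

In the JNI setting from the question, the native close() function would write the wake-up byte and the native read() function would run the select() loop.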
AFAIK signals are the only way to break any thread out of a blocking system call.
Use a pthread_kill() aimed at the thread with a USR1 signal.
You could probably do fake data input:
ioctl(fd, TIOCSTI, "x");  /* TIOCSTI queues a single byte; call it once per character */
Before calling it you should set some global flag, so that after read(...) returns you can check whether the received data is just wake-up filler or something more important.
Source: https://www.systutorials.com/docs/linux/man/4-tty_ioctl/
I need to do a project where the application monitors incoming connections and applies some rules as defined in an XML document. The rules either filter (block or permit) connections or redirect traffic on a certain port. In order to do this, I use functions such as accept and recv (from Winsock). All of those functions are used on different threads. I'm wondering, though, how I am supposed to clean up the program before exiting, since all those blocking calls are being made. Normally I'd either wait until the person exits the console through the X button or wait for the user to input a certain character in the main thread. The thing is, I'm not sure what happens if the application exits while there are still active threads, if memory is still allocated, or if sockets are in use. Are all destructors called? Are handles and sockets correctly closed? Or do I need to somehow do it myself?
Thanks
In general, I would say no. Do not try to explicitly clean up resources like sockets, fd's, handles, threads unless you are absolutely forced to.
Exact behaviour depends on OS and how you terminate your app.
All the common desktop OS will release resources allocated to a process by the OS when a process terminates. This includes sockets, file descriptors, memory.
On Windows/Linux, if you return from your C/C++ main() without any explicit cleanup, static dtors will get called by the crt code. Dtors for dynamically allocated objects in non-main threads are not run.
Executables written in other languages may behave differently.
If, instead of returning from main(), you call a 'ProcessExit()' API directly, static destructors will not get called because the OS has no concept of dtors - it has no idea, or interest, in what language was used to generate the executable.
In either case, the OS will be called to terminate your process. The OS does this, (simple 'Dummies' version:), by first changing the state of all process threads that are not running so that they never run again. Threads that are running on other cores are then stopped. Then OS resources like fd, sockets are closed, then released, then all process memory is freed, then OS kernel process/thread objects freed, then your process no longer exists.
If you absolutely need some, or all, C++/whatever dtors called when some thread needs to stop the app, you will have to explicitly signal other threads to stop so that dtors can be run. I tend to use a globally-accessible 'CloseRequested' bool that relevant blocking calls check immediately after returning. There remains the issue of persuading the blocking calls to return.
Some blocking calls can be coded up to wait on more than one signal, so allowing the call to return by a simple event/sema/condvar/whatever signal.
Some calls, like recv(), accept(), can be persuaded to return early by closing the fd/socket they are waiting on.
Some calls can be made to return by 'artificially' satisfying their wait condition - eg. creating a temp file just to make a folder-monitor call return so that the 'CloseRequested' bool can be checked.
If a blocking call is so annoyingly stubborn that it cannot be persuaded to return, you could redesign your app so that whatever the critical resource is that is released in the dtors can be released by another thread - maybe create the thing in another thread and pass it to the thread that blocks in a ctor parameter, something like that.
NOTE WELL: Thread shutdown code bodges, as listed above, are extra code that does not add to the normal functionality of your app. You should restrict explicit thread shutdown to those threads that hold resources that absolutely must be released by explicit user code - DB connections, say. If the OS can release the resource, it should be allowed to do so. The OS is very good at stopping all process threads before releasing resources they are using, user code is not.
Where possible, use blocking calls that take a timeout value, and have your threads loop. That gives you a place to check for a shutdown condition and exit the thread gracefully. Handles will generally be cleaned up by the system when the process exits. It is polite to shut down sockets gracefully, but not absolutely mandatory. The downside of not doing so is it can take a while for the kernel to clean up exclusive resources. For example, if you just kill a thread waiting to accept(), and then your app re-launches, it won't be able to successfully accept() on the same port until the kernel cleans up the old socket.
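A minimal sketch of the timeout-and-loop pattern. It uses POSIX sockets rather than Winsock (an assumption made so the example is self-contained; Winsock also has SO_RCVTIMEO, but there the option value is a DWORD in milliseconds). The names close_requested, recv_worker and demo_timeout_loop are made up.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

static atomic_int close_requested;   /* the shared shutdown flag */

static void *recv_worker(void *arg)
{
    int fd = *(int *)arg;
    struct timeval tv = { .tv_sec = 0, .tv_usec = 100 * 1000 };  /* 100 ms */
    char buf[256];

    /* After this, a blocked recv() gives up with EAGAIN/EWOULDBLOCK. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) == -1)
        return NULL;

    while (!atomic_load(&close_requested)) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n > 0)
            continue;                /* ... handle n bytes of data ... */
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
            continue;                /* timed out: loop and re-check the flag */
        break;                       /* peer closed the connection, or hard error */
    }
    return NULL;
}

/* Demo: start a worker on an idle socket, then ask it to stop. */
int demo_timeout_loop(void)
{
    int sv[2];
    pthread_t t;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    int rc = pthread_create(&t, NULL, recv_worker, &sv[0]);
    assert(rc == 0);

    atomic_store(&close_requested, 1);   /* request shutdown */
    pthread_join(t, NULL);               /* returns within about one timeout period */
    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

The timeout bounds how long shutdown can take; a shorter value means a faster exit at the cost of more wake-ups while idle.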
I've got a service that I need to shut down and update. I'm having difficulties with this in two different cases:
I have some threads that sleep for large amounts of time. Obviously I can't wait for them to wake up to finish shutting down the service. I had a thought to use an AutoResetEvent that gets set by some controller thread when the sleep interval is up (by just checking every two seconds or something), and triggering it immediately at OnClose time. Is there a better way to facilitate that?
I have one thread that makes a call to a blocking method call (one which I cannot modify). How do you signal such a thread to stop?
I'm not sure if I understood your first question correctly, but have you looked at using WaitForSingleObject as an alternative to Sleep? You can specify a timeout as well as an object to wait on, so if you want it to wake up earlier, just signal the object.
What exactly do you mean by "call to a blocking thread"? Or did you just mean a blocking call? In general, there isn't a way to interrupt a thread without forcefully terminating it. However, if the call is a system call, there might be ways to return control by making the call fail, eg. cancelling I/O or closing an associated handle.
For 1. you can get your threads into an interruptible sleep by using SleepEx rather than Sleep. Once they get this shutdown kick (initiated from your termination logic using QueueUserAPC), you can detect that it happened using the return code from SleepEx and terminate those threads accordingly. This is similar to the suggestion to use WaitForSingleObject, but you don't need another per-thread handle that's just used to terminate the associated thread.
The return value is zero if the specified time interval expired.
The return value is WAIT_IO_COMPLETION if the function returned due to one or more I/O completion callback functions. This can happen only if bAlertable is TRUE, and if the thread that called the SleepEx function is the same thread that called the extended I/O function.
For 2., that's a tough one unless you have access to some resource used in that thread that can cause the blocking call to abort in such a way that the calling thread can handle it cleanly. You may just have to implement code to kill that thread with extreme prejudice using TerminateThread (probably this should be the last thing you do before exiting the process) and see what happens under test.
An easy and reliable solution is to kill the service process. A process is the memory-safe abstraction of the OS, after all, so you can safely terminate one without regard for process-internal state - of course, if your process is communicating or fiddling with external state, all bets are off...
Additionally, you could implement the solution which OS's themselves commonly do: one warning signal asking the process to clean up as best possible (which sets a flag and gracefully exits what can be gracefully stopped), and then forceful termination if the process doesn't exit by itself (which ends pesky things like blocking I/O).
All services should be built such that forceful termination isn't harmful, since these processes are system managed and may be terminated by things such as a reboot - i.e., your service ideally should permit this without corrupting storage anyhow.
Oh, and one final warning: Windows services may share a process (I presume for efficiency, though it strikes me as an avoidable optimization), so if you go this route, you want to make sure your service is not sharing a process with other services. You can ensure this by passing the option SERVICE_WIN32_OWN_PROCESS to ChangeServiceConfig.