Sys.set_signal interrupts input_line on the main thread, but not in child threads - multithreading

What is the proper way to write an interruptible reader thread in OCaml? Concretely, the following single-threaded program works (that is, Ctrl-C Ctrl-C interrupts it immediately):
exception SigInt

let _ =
  Sys.set_signal Sys.sigint (Sys.Signal_handle (fun _ -> raise SigInt));
  try output_string stdout (input_line stdin);
  with SigInt -> print_endline "SINGLE_SIGINT"
The following program, on the other hand, cannot be interrupted with C-c C-c:
let _ =
  Sys.set_signal Sys.sigint (Sys.Signal_handle (fun _ -> raise SigInt));
  let go () =
    try output_string stdout (input_line stdin);
    with SigInt -> print_endline "CHILD_SIGINT" in
  try Thread.join (Thread.create go ());
  with SigInt -> print_endline "PARENT_SIGINT"
What is a cross-platform way to implement an interruptible reader thread in OCaml? That is, what changes do I need to make to the multithreaded program above to make it interruptible?
I've explored multiple hypotheses to understand why the multithreaded example above is not working, but none of them fully made sense to me:
Maybe input_line isn't interruptible? But then the single-threaded example above would not work.
Maybe Thread.join is blocking the signal for the whole process? But in that case the following example would not be interruptible either:
let _ =
  Sys.set_signal Sys.sigint (Sys.Signal_handle (fun _ -> raise SigInt));
  let rec alloc acc =
    alloc (1::acc) in
  let go () =
    try alloc []
    with SigInt -> print_endline "CHILD_SIGINT" in
  try Thread.join (Thread.create go ());
  with SigInt -> print_endline "PARENT_SIGINT"
…and yet it is: pressing Ctrl-C Ctrl-C exits immediately.
Maybe the signal is delivered to the main thread, which is waiting uninterruptibly in Thread.join? If this were true, pressing Ctrl-C Ctrl-C and then Enter would print "PARENT_SIGINT". But it doesn't: it prints "CHILD_SIGINT", meaning that the signal was routed to the child thread and delayed until input_line completed. Surprisingly, though, the following works (and it prints CHILD_SIGINT):
let multithreaded_sigmask () =
  Sys.set_signal Sys.sigint (Sys.Signal_handle (fun _ -> raise SigInt));
  let go () =
    try
      ignore (Thread.sigmask Unix.SIG_SETMASK []);
      output_string stdout (input_line stdin);
    with SigInt -> print_endline "CHILD_SIGINT" in
  try
    ignore (Thread.sigmask Unix.SIG_SETMASK [Sys.sigint]);
    Thread.join (Thread.create go ());
  with SigInt -> print_endline "PARENT_SIGINT"
… but sigmask is not available on Windows.

Two things work together to make this behavior hard to understand: how the OS delivers signals to the process, and how the OCaml runtime then delivers them to your application code.
Looking at the OCaml source code, its OS-level signal handler simply records the fact that a signal was raised, via a global variable. That flag is then polled by other parts of the OCaml runtime, at times when it is safe to deliver the signal. So Thread.sigmask controls which thread(s) the OS may deliver the signal to, i.e. where the OCaml runtime first sees it; it does not control delivery to your application code.
Pending signals are delivered by caml_process_pending_signals(), which is called by caml_enter_blocking_section() and caml_leave_blocking_section(). There is no thread mask or affinity at this level: whichever thread is first to process the global list of pending signals does so.
The input_line function polls the OS for fresh input, and each time it does, it enters and leaves the blocking section, so it is polling frequently for signals.
Thread.join enters the blocking section, then blocks indefinitely, until the thread is finished, then leaves the blocking section. So while it is waiting, it is not polling for pending signals.
In your first multithreaded example, what happens if you actually type and hit Enter? Does the input_line call actually accumulate input and return it? It may not: the Thread.join may own the blocking section and be preventing input and signal delivery process-wide.
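Based on that explanation, one possible portable workaround is sketched below. This is my own sketch, not code from the question; the names and the 0.1 s polling interval are arbitrary. Instead of parking the main thread in Thread.join, it polls a completion flag with Thread.delay: each return from Thread.delay re-enters the runtime, which processes pending signals, so SigInt can be raised and caught on the main thread even while the reader thread is still blocked in input_line. Depending on timing and platform, the exception may still end up being delivered to the reader thread instead.
exception SigInt

let () =
  Sys.set_signal Sys.sigint (Sys.Signal_handle (fun _ -> raise SigInt));
  let finished = ref false in
  let go () =
    (try output_string stdout (input_line stdin)
     with SigInt -> print_endline "CHILD_SIGINT");
    finished := true
  in
  let _reader = Thread.create go () in
  try
    (* Poll instead of Thread.join: every wake-up from Thread.delay gives
       the runtime a chance to deliver pending signals on this thread. *)
    while not !finished do
      Thread.delay 0.1
    done
  with SigInt -> print_endline "PARENT_SIGINT"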

Related

Waiting on main thread for callback methods

I am very new to Scala and following the Scala Book Concurrency section (from docs.scala-lang.org). Based off of the example they give in the book, I wrote a very simple code block to test using Futures:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

object Main {
  def main(args: Array[String]): Unit = {
    val a = Future { Thread.sleep(10*100); 42 }
    a.onComplete {
      case Success(x) => println(a)
      case Failure(e) => e.printStackTrace
    }
    Thread.sleep(5000)
  }
}
When compiled and run, this properly prints out:
Future(Success(42))
to the console. I'm having trouble wrapping my head around why the Thread.sleep() call comes after the onComplete callback. Intuitively, at least to me, it would make sense to call Thread.sleep() before the callback, so that by the time the main thread gets to the onComplete call, a has been assigned a value. If I move the Thread.sleep() call to before a.onComplete, nothing prints to the console. I'm probably overthinking this, but any help clarifying would be greatly appreciated.
When you use the Thread.sleep() after registering the callback
a.onComplete {
  case Success(x) => println(a)
  case Failure(e) => e.printStackTrace
}
Thread.sleep(5000)
then the thread that is executing the body of the future has time to sleep one second and to set 42 as the result of the successful future execution. By that time (after approx. 1 second), the onComplete callback is already registered, so the thread invokes it as well, and you see the output on the console.
The sequence is essentially:
t = 0: Daemon thread begins the computation of 42
t = 0: Main thread creates and registers callback.
t = 1: Daemon thread finishes the computation of 42
t = 1 + eps: Daemon thread finds the registered callback and invokes it with the result Success(42).
t = 5: Main thread terminates
t = 5 + eps: program is stopped.
(I'm using eps informally as a placeholder for some reasonably small time interval; + eps means "almost immediately thereafter".)
If you swap the a.onComplete and the outer Thread.sleep as in
Thread.sleep(5000)
a.onComplete {
  case Success(x) => println(a)
  case Failure(e) => e.printStackTrace
}
then the thread that is executing the body of the future will compute the result 42 after one second, but it will not see any registered callbacks (it would have to wait four more seconds until the callback is created and registered on the main thread). Once 5 seconds have passed, the main thread registers the callback and exits immediately. Even though by that time the result 42 has already been computed, the main thread does not attempt to execute the callback itself, because that's none of its business (that's what the threads in the execution context are for). When the main thread exits, all the daemon threads in the thread pool are killed with it and the program stops, so you don't see anything in the console.
The usual sequence of events is roughly this:
t = 0: Daemon thread begins the computation of 42
t = 1: Daemon thread finishes the computation of 42, but cannot do anything with it.
t = 5: Main thread creates and registers the callback
t = 5 + eps: Main thread terminates, daemon thread is killed, program is stopped.
so that there is (almost) no time when the daemon thread could wake up, find the callback, and invoke it.
A lot of things in Scala are functions and don't necessarily look like it. The argument to onComplete is one of those things. What you've written is
a.onComplete {
  case Success(x) => println(a)
  case Failure(e) => e.printStackTrace
}
What that translates to after all the Scala magic is effectively (modulo PartialFunction shenanigans)
a.onComplete({ value =>
  value match {
    case Success(x) => println(a)
    case Failure(e) => e.printStackTrace
  }
})
onComplete isn't actually doing any work. It's just setting up a function that will be called at a later date. So we want to do that as soon as we can, and the Scala scheduler (specifically, the ExecutionContext) will invoke our callback at its convenience.
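As an aside, and as my own sketch rather than part of either answer (the object name is arbitrary): if the goal is simply to keep the main thread alive until the future has finished, blocking on the future itself with Await.result is more deterministic than sleeping for an arbitrary 5 seconds. The onComplete callback still runs on the execution context, so its output is not strictly guaranteed to appear before the JVM exits, but the result itself is obtained synchronously:
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object MainAwait {
  def main(args: Array[String]): Unit = {
    val a = Future { Thread.sleep(1000); 42 }
    a.onComplete {
      case Success(x) => println(s"callback saw $x")
      case Failure(e) => e.printStackTrace()
    }
    // Block the main thread until the future completes (or 5 s pass).
    val result = Await.result(a, 5.seconds)
    println(s"Await.result returned $result")
  }
}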

handling SIGINT (ctrl-c) in golang to convert it to a panic

My goal is to have a handler for SIGINT (i.e., Ctrl-C on the CLI) which will allow deferred function calls to run instead of causing a hard exit. The use case is a test suite with very long-running tests: I want the CLI user to be able to trigger test cleanup early with Ctrl-C. The test cleanup functions should all be on the deferred function stacks of each of the test functions, so demoting SIGINT to a panic should, in my mind, cause those cleanup functions to run.
The code below is my attempt to do that. If you run this with go run ., you'll see
$ go run .
regular action ran!
post-Sleep action ran!
deferred action ran!
But if you interrupt it during the 5 seconds of sleep, you'll see this instead:
regular action ran!^Cpanic: interrupt
goroutine 8 [running]:
main.panic_on_interrupt(0xc00007c180)
/home/curran/dev/test/main.go:12 +0x5e
created by main.main
/home/curran/dev/test/main.go:20 +0xab
exit status 2
I added the interrupt handler and the goroutine because I thought that would de-escalate the SIGINT into a panic and allow the call to fmt.Printf("deferred action ran!") to execute. However, that did not end up being the case.
Here's the code in question:
package main

import (
    "fmt"
    "os"
    "os/signal"
    "time"
)

func panic_on_interrupt(c chan os.Signal) {
    sig := <-c
    panic(sig)
}

func main() {
    c := make(chan os.Signal, 1)
    // Passing no signals to Notify means that
    // all signals will be sent to the channel.
    signal.Notify(c, os.Interrupt)
    go panic_on_interrupt(c)
    fmt.Printf("regular action ran!")
    defer fmt.Printf("deferred action ran!")
    time.Sleep(5 * time.Second)
    fmt.Printf("post-Sleep action ran!")
}
One suggested restructuring is to receive the signal in the main goroutine itself and let main return normally, so that its deferred cleanup runs without any panic:
func main() {
    fmt.Printf("regular action ran!")
    defer fmt.Printf("deferred action ran!")
    go startLongRunningTest()
    defer longRunningTestCleanupCode()
    //time.Sleep(5 * time.Second)
    //fmt.Printf("post-Sleep action ran!")
    c := make(chan os.Signal, 1)
    signal.Notify(c, os.Interrupt)
    <-c
}
time.Sleep() blocks the running goroutine for the specified time.
You may defer your cleanup code.
Also, you can run time-consuming tests in separate goroutines instead of panicking there.
Avoid the use of panic() if possible.
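Putting the answer's suggestions together, here is a runnable sketch. It is my own elaboration; longRunningTest and the printed messages are hypothetical stand-ins for the real test and cleanup code. The main goroutine waits for either the test to finish or a SIGINT, then returns normally, so its deferred cleanup runs in both cases and no panic is needed. Note that a panic raised in a different goroutine, as in the original code, crashes the program without running main's deferred functions, which is why "deferred action ran!" never appeared.
package main

import (
    "fmt"
    "os"
    "os/signal"
    "time"
)

// longRunningTest stands in for the real long-running test; it signals
// completion by closing the done channel.
func longRunningTest(done chan<- struct{}) {
    defer close(done)
    time.Sleep(5 * time.Second)
    fmt.Println("test finished normally")
}

func main() {
    defer fmt.Println("cleanup ran") // stand-in for the real cleanup code

    done := make(chan struct{})
    go longRunningTest(done)

    c := make(chan os.Signal, 1)
    signal.Notify(c, os.Interrupt)

    select {
    case <-done:
        fmt.Println("all tests completed")
    case sig := <-c:
        fmt.Println("received", sig, "- cleaning up early")
    }
    // Returning from main runs the deferred cleanup above.
}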

How to safely interact with channels in goroutines in Golang

I am new to Go and I am trying to understand the way channels in goroutines work. To my understanding, the range keyword can be used to iterate over the values of a channel until the channel is closed or the buffer runs out; hence, a for range c will loop repeatedly until the buffer runs out.
I have the following simple function that adds values to a channel:
func main() {
    c := make(chan int)
    go printchannel(c)
    for i := 0; i < 10; i++ {
        c <- i
    }
}
I have two implementations of printchannel and I am not sure why the behaviour is different.
Implementation 1:
func printchannel(c chan int) {
    for range c {
        fmt.Println(<-c)
    }
}
output: 1 3 5 7
Implementation 2:
func printchannel(c chan int) {
    for i := range c {
        fmt.Println(i)
    }
}
output: 0 1 2 3 4 5 6 7 8
And I was expecting neither of those outputs!
Wanted output: 0 1 2 3 4 5 6 7 8 9
Shouldn't the main function and the printchannel function run on two threads in parallel, one adding values to the channel and the other reading the values up until the channel is closed? I might be missing some fundamental Go/thread concept here, and pointers to that would be helpful.
Feedback on this (and my understanding to channels manipulation in goroutines) is greatly appreciated!
Implementation 1. You're reading from the channel twice - range c and <-c are both reading from the channel.
Implementation 2. That's the correct approach. The reason you might not see 9 printed is that two goroutines might run in parallel threads. In that case it might go like this:
main goroutine sends 9 to the channel and blocks until it's read
second goroutine receives 9 from the channel
main goroutine unblocks and exits. That terminates whole program which doesn't give second goroutine a chance to print 9
In a case like that you have to synchronize your goroutines. For example, like so:
func printchannel(c chan int, wg *sync.WaitGroup) {
    for i := range c {
        fmt.Println(i)
    }
    wg.Done() // notify that we're done here
}

func main() {
    c := make(chan int)
    wg := sync.WaitGroup{}
    wg.Add(1) // increase by one to wait for one goroutine to finish
    // very important to do it here and not in the goroutine,
    // otherwise you get a race condition
    go printchannel(c, &wg) // very important to pass wg by reference:
    // sync.WaitGroup is a structure, passing it
    // by value would produce incorrect results
    for i := 0; i < 10; i++ {
        c <- i
    }
    close(c)  // close the channel to terminate the range loop
    wg.Wait() // wait for the goroutine to finish
}
As to goroutines vs. threads: you shouldn't confuse them, and you probably should understand the difference between them. Goroutines are green threads. There are countless blog posts, lectures and Stack Overflow answers on that topic.
In implementation 1, range reads from the channel once, and then <-c reads from it again inside Println. Hence you're skipping over every other value (0, 2, 4, 6, 8).
In both implementations, once the final i (9) has been sent to the goroutine, the program exits, so the goroutine does not have time to print 9. To solve this, use a WaitGroup as mentioned in the other answer, or a done channel to avoid a semaphore/mutex.
func main() {
    c := make(chan int)
    done := make(chan bool)
    go printchannel(c, done)
    for i := 0; i < 10; i++ {
        c <- i
    }
    close(c)
    <-done
}

func printchannel(c chan int, done chan bool) {
    for i := range c {
        fmt.Println(i)
    }
    done <- true
}
The reason your first implementation only returns every other number is because you are, in effect "taking" from c twice each time the loop runs: first with range, then again with <-. It just happens that you're not actually binding or using the first value taken off the channel, so all you end up printing is every other one.
An alternative approach to your first implementation would be to not use range at all, e.g.:
func printchannel(c chan int) {
    for {
        fmt.Println(<-c)
    }
}
I could not replicate the behavior of your second implementation on my machine, but the reason for that is that both of your implementations are racy: they will terminate whenever main ends, regardless of what data may be pending in a channel or however many goroutines may be active.
As a closing note, I'd warn you not to think about goroutines as explicitly being "threads", though they have a similar mental model and interface. In a simple program like this it's not at all unlikely that Go might just do it all using a single OS thread.
Your first loop does not work as intended because you have two blocking channel receives, and they do not execute at the same time.
When you start the goroutine, the loop begins and waits for the first value to be sent to the channel. Effectively, think of it as <-c.
When the for loop in the main function runs, it sends 0 on the channel. At this point range c receives the value and stops blocking the execution of the loop.
Then the loop is blocked by the receive at fmt.Println(<-c). When 1 is sent on the second iteration of the loop in main, the receive at fmt.Println(<-c) reads from the channel, allowing fmt.Println to execute, thus finishing that iteration and waiting again for a value at for range c.
Your second implementation of the looping mechanism is the correct one.
The reason it exits before printing 9 is that after the for loop in main finishes, the program goes ahead and completes the execution of main.
In Go, func main runs as a goroutine itself. Thus, when the for loop in main completes, main goes ahead and exits, and since the print happens in a parallel goroutine that is killed along with it, it is never executed. There is no time for it to print, because nothing blocks main from completing and exiting the program.
One way to solve this is to use wait groups: http://www.golangprograms.com/go-language/concurrency.html
In order to get the expected output, you need a blocking operation in main that waits for confirmation that the goroutine has finished before allowing the program to exit.

How can I exit the program from a sigTERM handler?

Consider something like this:
...
handleShutdown :: ThreadId -> IO ()
handleShutdown tid = doSomethingFunny >> throwTo tid ExitSuccess

main = do
  ...
  installHandler sigTERM (Catch $ myThreadId >>= handleShutdown) Nothing
  forever $ do
    stuff
  ...
If sigINT (Ctrl+C) is handled in this manner, the process finishes nicely. However, it seems like sigTERM is being used by Haskell internally and the code above doesn't exit from the main process at all. Is there a way to exit the process from a sigTERM handler without using an MVar and a custom loop? I couldn't find any information on the sigTERM handling anywhere (didn't read ghc sources, that's just too much for me to handle).
Update:
The following works:
main = do
  ...
  tid <- myThreadId -- This moved out of the Catch handler below.
  installHandler sigTERM (Catch $ handleShutdown tid) Nothing
  forever $ do
    stuff
  ...
Sorry for the short answer, but I'm on mobile.
You want to run myThreadId outside of the handler itself to get the main thread's ID. You're currently getting the ID of the thread that runs the signal handler itself.

Safely close an indefinitely running thread

So first off, I realize that if my code was in a loop I could use a do while loop to check a variable set when I want the thread to close, but in this case that is not possible (so it seems):
DWORD WINAPI recv_thread(LPVOID random) {
    recv(ClientSocket, recvbuffer, recvbuflen, 0);
    return 1;
}
In the above, recv() is a blocking function.
(Please pardon me if the formatting isn't correct. It's the best I can do on my phone.)
How would I go about terminating this thread since it never closes but never loops?
Thanks,
~P
Amongst other solutions you can
a) set a timeout for the socket and handle timeouts correctly by checking the return values and/or errors in an appropriate loop (see the sketch after this list):
setsockopt(ClientSocket, SOL_SOCKET, SO_RCVTIMEO, (char *)&timeout, sizeof(timeout));
b) close the socket from another thread, which makes the blocked recv(..) return with an error.
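Here is a POSIX-flavored sketch of option (a). It is my own illustration rather than the answerer's code, and stop_requested is a hypothetical flag set by the main thread. With SO_RCVTIMEO set, recv() fails with EAGAIN/EWOULDBLOCK when the timeout expires, so the thread can periodically re-check the flag instead of blocking forever. (On Windows the timeout argument is a DWORD in milliseconds rather than a struct timeval.)
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

static atomic_bool stop_requested;   /* set to true by the main thread */

void recv_loop(int client_socket)
{
    struct timeval timeout = { .tv_sec = 1, .tv_usec = 0 };
    setsockopt(client_socket, SOL_SOCKET, SO_RCVTIMEO,
               (char *)&timeout, sizeof timeout);

    char buf[4096];
    while (!atomic_load(&stop_requested)) {
        ssize_t n = recv(client_socket, buf, sizeof buf, 0);
        if (n > 0) {
            /* ... process n bytes of received data ... */
        } else if (n == 0) {
            break;                    /* peer closed the connection */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            continue;                 /* timeout expired: re-check the flag */
        } else {
            break;                    /* real error */
        }
    }
}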
You can use poll() before recv() to check whether there is something to receive:
struct pollfd pfd;
int res;

pfd.fd = ClientSocket;
pfd.events = POLLIN;

res = poll(&pfd, 1, 1000); // 1000 ms timeout
if (res == 0)
{
    // timeout
}
else if (res == -1)
{
    // error
}
else
{
    // implies (pfd.revents & POLLIN) != 0
    recv(ClientSocket, recvbuffer, recvbuflen, 0); // we can read ...
}
The way I handle this problem is to never block inside recv() -- preferably by setting the socket to non-blocking mode, but you may also be able to get away with simply only calling recv() when you know the socket currently has some bytes available to read.
That leads to the next question: if you don't block inside recv(), how do you prevent CPU-spinning? The answer to that question is to call select() (or poll()) with the correct arguments so that you'll block there until the socket has bytes ready to recv().
Then comes the third question: if your thread is now blocked (possibly forever) inside select(), aren't we back to the original problem again? Not quite, because now we can implement a variation of the self-pipe trick. In particular, because select() (or poll()) can 'watch' multiple sockets at the same time, we can tell the call to block until either of two sockets has data ready-to-read. Then, when we want to shut down the thread, all the main thread has to do is send a single byte of data to the second socket, and that will cause select() to return immediately. When the thread sees that it is this second socket that is ready-for-read, it should respond by exiting, so that the main thread's blocking call to WaitForSingleObject(theThreadHandle) will return, and then the main thread can clean up without any risk of race conditions.
The final question is: how to set up a socket-pair so that your main thread can call send() on one of the pair's sockets, and your recv-thread will see the sent data appear on the other socket? Under POSIX it's easy, there is a socketpair() function that does exactly that. Under Windows, socketpair() does not exist, but you can roll your own implementation of it as shown here.
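For completeness, here is a sketch of the select()-plus-socketpair pattern described above. It is my own illustration with hypothetical names, written against the POSIX API; on Windows you would substitute the socketpair() emulation the answer refers to, and join the thread with WaitForSingleObject() as described.
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* shutdown_pair[0] is watched by the receive thread; the main thread
 * writes a byte to shutdown_pair[1] when it wants the thread to exit. */
static int shutdown_pair[2];

void *interruptible_recv_thread(void *arg)
{
    int client = *(int *)arg;
    char buf[4096];

    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(client, &readfds);
        FD_SET(shutdown_pair[0], &readfds);
        int maxfd = (client > shutdown_pair[0] ? client : shutdown_pair[0]) + 1;

        if (select(maxfd, &readfds, NULL, NULL, NULL) < 0)
            break;                            /* select() failed: give up */

        if (FD_ISSET(shutdown_pair[0], &readfds))
            break;                            /* main thread asked us to exit */

        if (FD_ISSET(client, &readfds)) {
            ssize_t n = recv(client, buf, sizeof buf, 0);
            if (n <= 0)
                break;                        /* connection closed or error */
            /* ... process n bytes of received data ... */
        }
    }
    return NULL;
}

/* Main-thread side (sketch):
 *   socketpair(AF_UNIX, SOCK_STREAM, 0, shutdown_pair);
 *   ...create the thread...
 *   send(shutdown_pair[1], "x", 1, 0);  // wake the receive thread up
 *   ...join the thread, then clean up... */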
