Haskell: getProcessStatus blocking SIGINT

Haskell: getProcessStatus blocking SIGINT - haskell

I'm trying to write a simple shell in Haskell, but I cant get the signal handling to work. If no command is running, sending SIGINT to the shell process triggers the signal handler. But when a blocking call to getProcessStatus is made, the signal is ignored. Sending a signal immediately to the child process of course kills the child and makes the blocking call return.
Replacing the blocking call with Control.Concurrent.threadDelay does not prevent the signal, i.e., everything works as intended. Replacing the blocking flag to getProcessStatus with False makes the function return before the child process has finished.
Reference to process package: https://hackage.haskell.org/package/unix-2.7.1.0/docs/System-Posix-Process.html#v:getProcessStatus
The relevant code is below, see the (only) commented line.
main :: IO ()
main = do
pidRef <- (newIORef [] :: IO (IORef [ProcessID]))
setSigHant pidRef
doPrompt pidRef
printPrompt :: IO ()
printPrompt = fdWrite stdError "λ➔ " >> return ()
doPrompt :: IORef [ProcessID] -> IO ()
doPrompt pidRef = do
printPrompt
tryLine <- try getLine :: IO (Either SomeException String)
case tryLine of
Left _ -> do
putStrLn ""
exitSuccess
Right line -> do
tryCl <- try (parse line) :: IO (Either SomeException [Command])
case tryCl of
Left e -> fdWrite stdError (show e ++ "\n") >> return ()
Right cl ->
if length cl > 0 && (cmd . head) cl == "cd" then
cd (head cl)
else do
execCommands pidRef cl (stdInput, stdOutput)
pids <- readIORef pidRef
-- This call to getProcessStatus blocks the signals
_ <- sequence $ map (getProcessStatus True False) pids
_ <- writeIORef pidRef []
return ()
doPrompt pidRef
setSigHant :: (IORef [ProcessID]) -> IO ()
setSigHant pidRef = do
let handler = Catch (sigIntHandler pidRef)
installHandler sigINT handler Nothing
return ()
sigIntHandler :: (IORef [ProcessID]) -> IO ()
sigIntHandler pidRef = do
pids <- readIORef pidRef
sequence_ $ map (signalProcess sigINT) pids
fdWrite stdError "\n"
printPrompt

getProcessStatus uses an interruptible FFI call internally. But why is -threaded necessary?
This blog post about handling ctrl-c in Haskell suggests that signal handling is done in a separate thread that kills the main thread using an asynchronous exception:
When the user hits Ctrl-C, GHC raises an async exception of type
UserInterrupt on the main thread. This happens because GHC installs an
interrupt handler which raises that exception, sending it to the main
thread with throwTo.
But the documentation for the async package mentions that:
Different Haskell implementations have different characteristics with
regard to which operations block all threads.
Using GHC without the -threaded option, all foreign calls will block
all other Haskell threads in the system, although I/O operations will
not. With the -threaded option, only foreign calls with the unsafe
attribute will block all other threads.
So maybe that's why proper handling of SIGINT in presence of interruptible ffi calls requires -threaded: otherwise, the thread that throws the asynchronous exception will be prevented from running.

Related

"thread blocked indefinitely in an STM transaction" in a case where threads are never blocked

I'm using the async library in conjunction with stm in my program.
The main thread forks two threads which run until one of them (it could be either one) encounters a solution. The solution is returned via a TMVar. Neither of them ever waits on any TMVar except to call putTMVar when the solution is found and one of them is guaranteed to run forever unless killed. So how could I possibly be getting "thread blocked indefinitely in an STM transaction" (which seems to happen approximately one in every twenty times) given that at least one of the child threads doesn't execute any blocking STM transactions (or die) until storing a result.
Note the two child threads communicate somewhat with each other using TVars, but not with TMVars.
Simplified code:
main :: IO ()
main = do
output <- newEmptyTMVar
result <- withAsync (child1 output) $ \_ -> withAsync (child2 output) $ \_ ->
let go = do
result <- atomically $ takeTMVar output
if someCondition result
then return result
else go
in go
print result
child1 :: TMVar Result -> IO ()
child1 output = go 0
where
go i = do
case computation1 i of
Nothing -> return ()
Just x -> atomically $ putTMVar x
go (i + 1)
child2 :: TMVar Result -> IO ()
-- Does some other stuff, but also only interacts with its argument to
-- give back a result, same as child1.

How to link parent async with multiple children asyncs

The documentation of the async package describes the withAsync function as:
Spawn an asynchronous action in a separate thread, and pass its Async
handle to the supplied function. When the function returns or throws
an exception, uninterruptibleCancel is called on the Async. This is a
useful variant of async that ensures an Async is never left running
unintentionally.
I've been staring at that for the past 2 hours and haven't been able to figure out how to start a monitor thread, that spawns multiple worker threads, such that:
If the monitor thread dies, all worker threads should be killed,
But, if any worker thread dies, none of the other worker threads should be affected. The monitor should be notified and it should be able to restart the worker thread.

It seems that we need two functions: one that starts all the async tasks and another that watches them and restarts them whenever they die.
The first one could be written like this:
withAsyncMany :: [IO t] -> ([Async t] -> IO b) -> IO b
withAsyncMany [] f = f []
withAsyncMany (t:ts) f = withAsync t $ \a -> withAsyncMany ts (f . (a:))
If we were using the managed package, we could also write it like this:
import Control.Monad.Managed (with,managed)
withAsyncMany' :: [IO t] -> ([Async t] -> IO b) -> IO b
withAsyncMany' = with . traverse (\t -> managed (withAsync t))
The restart function would loop the list of asyncs, polling for their status and renewing them when they fail:
{-# language NumDecimals #-}
import Control.Concurrent (threadDelay)
resurrect :: IO t -> [Async t] -> IO ()
resurrect restartAction = go []
where
go ts [] = do
threadDelay 1e6 -- wait a little before the next round of polling
go [] (reverse ts)
go past (a:pending) = do
status <- poll a -- has the task died, or finished?
case status of
Nothing -> go (a:past) pending
Just _ -> withAsync restartAction $ \a' -> go (a':past) pending
I'm worried however about the possibility of many nested withAsyncs causing some type of resource leak (because some kind of exception handler must be installed with each withAsync to notify the child in case the parent thread dies).
So perhaps in this case it would be better to spawn workers with plain asyncs, store the collection of Asyncs into some kind of mutable reference and install a single exception handler in the monitor thread, which would traverse the container terminating each task.

Here's another answer, that uses async instead of withAsync. The main function is
monitor :: Int -> IO () -> IO ()
monitor count task =
bracket (do asyncs <- replicateM count (async task)
newIORef asyncs)
(\ref -> forever (do
threadDelay 1e6
asyncs <- readIORef ref
vivify task (writeIORef ref) asyncs))
(\ref -> do
asyncs <- readIORef ref
mapM_ uninterruptibleCancel asyncs)
it uses an auxiliary vivify function that traverses a list of Asyncs, reviving dead ones and writing back the updated list to an IORef:
vivify :: IO () -> ([Async ()] -> IO ()) -> [Async ()] -> IO ()
vivify task write asyncs = go [] asyncs
where
go _ [] = do
return ()
go past (a:pending) = do
status <- poll a
case status of
Nothing -> do
go (a:past) pending
Just _ -> do
past' <- mask_ $ do
a' <- async task
write (reverse (a':past) ++ pending)
return (a':past)
go past' pending
We mask asynchronous exceptions in the interval between creating a new Async and "persisting" it in the IOref, because otherwise if an asynchronous exception arrived inbetween and killed the monitor thread, that Async would remain dangling.

In reactive-banana, is it safe to trigger handler actions from multiple threads?

Is it safe to trigger the fire action in
(addHandler, fire) <- newAddHandler
from a different thread from which the reactive-banana graph was compiled?

Yes, this is safe, but there is the caveat that #Cirdec mentioned.
For conreteness, consider the following example that creates an event network using the addHandler in a separate thread and then calls fire repeatedly in the main thread
import Control.Concurrent (myThreadId, threadDelay, forkIO)
main = do
...
(addHandler, fire) <- newAddHandler
let networkDescription :: MomentIO ()
networkDescription = do
e <- fromAddHandler addHandler
...
reactimate $ (print =<< myThreadId) <$ e -- reactimate
forkIO $ do
network <- compile networkDescription
actuate network
...
forever $ do -- event loop
threadDelay (10^6)
fire ()
(See the documentation "Terminating the program" in Control.Concurrent for why I've put the event loop in the main thread as opposed to putting the network in the main thread.)
In this and similar situations, the following will hold:
The IO actions executed by the reactimate will be run in the thread that calls fire, not in the thread where the network was compiled. This is what #Cirdec already mentioned.
If there were a second thread also calling fire, then it could potentially interleave with other calls to fire, i.e. the program could be calling fire twice concurrently. Then,
Reactive-banana uses a lock to ensure that Behaviors and Events are updated consistently. You can treat them as pure functions Time -> a and lists [(Time,a)] as usual.
However, the IO actions from the reactimates may interleave. In other words, the pure FRP part will stay pure, but the actual IO is subject to concurrency as usual.

Firing the fire handler itself is safe; it reads an IORef that is being updated atomically and runs each of the added handlers in the current thread. Whether or not that's safe will depend on what handlers have been added to the addHandler.
Using the addHandler in interpretAsHandler, fromAddHandler, or fromChanges should be safe. Nothing I know of in reactive-banana has any thread affinity, and even if it did, these are what newAddHandler was made for, so it should be safe anyway.
What you need to be careful of is the IO () actions executed by reactimate. If you need to reactimate IO actions that need to be run in a specific thread (for OpenGL output, etc), you need to only produce IO () actions that will send their data to that thread. In this complete OpenGL example for reactive-banana the IO () actions for OpenGL output, which have thread affinity, are run in the OpenGL thread. Instead ofreactimateing the Event (IO ()) executing them directly they are added to an IORef
whenIdleRef <- newIORef (return ())
let
addWhenIdle :: IO () -> IO ()
addWhenIdle y = atomicModifyIORef' whenIdleRef (\x -> (x >> y, ()))
runWhenIdle :: IO ()
runWhenIdle = atomicModifyIORef' whenIdleRef (\x -> (return (), x)) >>= id
let networkDescription :: forall t. Frameworks t => Moment t ()
networkDescription = do
reactimate $ fmap addWhenIdle (whenIdle outputs)
^ ^
| Event (IO ())
Stuff the event into an IORef
The IORef holding which IO () actions to run is read and each of all the actions are run in a context that I know is in the OpenGL thread.
idleCallback $= Just (do -- will be executed in the OpenGL thread when it's idle
getCurrentTime >>= raiseTime
runWhenIdle -- run those `IO ()` actions in this thread
postRedisplay Nothing)

How does one atomically use a set of handles?

Let's say you have a program with a bunch of threads. The one thread would like to freeze access to stdin, stdout, and stderr (causing any other threads or keyboards to block until its done) so that its output doesn't get interweaved with them. Is there a way to do this directly, or would there have to be a manager thread, you know, managin' the handle. Relatedly, could you cause any input on stdin to block any output on stdout until it received and handled (atomically)?

You can easily simulate a lock for controlling access to a resource with an MVar. You aquire the lock by taking the value with takeMVar and release the lock by replacing the value with putMVar. For example, we can define something like the following
import Control.Concurrent
import Control.Concurrent.MVar
main = do
stdinLock <- newMVar () -- create a new lock for stdin (unaquired)
let
printWithLabel a b = do
takeMVar stdinLock -- aquire the lock for stdin
putStrLn (show a ++ ":")
print b
putMVar stdinLock () -- release the lock for stdin
actions = map fork $ zipWith printWithLabel [1..26] ['A'..]
doneSignals <- sequence actions
sequence doneSignals
return ()
fork :: IO a -> IO (IO ())
fork a = do
done <- newEmptyMVar
forkIO (a >> putMVar done ())
return (takeMVar done)
We could extract the locking functionality into another function
withLock :: MVar () -> IO a -> IO a
withLock lock action = do
takeMVar lock
x <- action
putMVar lock ()
return x
withLock performs an IO action after acquiring a lock and releases it when were done. This doesn't properly handle what to do if the code throws exceptions and notably will not release the lock if an exception is thrown. The Lock in concurrent-extra provides a similar helper function which brackets an operation (handling exceptions) with acquiring and releasing a lock.
In terms of Lock and async the above example can be simplified to
import qualified Control.Concurrent.Lock as Lock
import Control.Concurrent.Async
main = do
stdinLock <- Lock.new
let
printWithLabel a b = Lock.with stdinLock $ do
putStrLn (show a ++ ":")
print b
actions = zipWith printWithLabel [1..26] ['A'..]
doneSignals <- mapM async actions
mapM_ wait doneSignals
If you want a thread reading input on stdin to block output from other threads to stdout you can use a single lock to control both stdin and stdout.

Catching Control-C exception in GHC (Haskell)

I built a really simple read-eval-print-loop in Haskell that catches Control-C (UserInterrupt). However, whenever I compile and run this program, it always catches the first Control-C and always aborts on the second Control-C with exit code 130. It doesn't matter how many lines of input I give it before and between the two Control-Cs, it always happens this way. I know I must be missing something simple... please help, thanks!
Note: this is with base-4 exceptions, so Control.Exception and not Control.OldException.
import Control.Exception as E
import System.IO
main :: IO ()
main = do hSetBuffering stdout NoBuffering
hSetBuffering stdin NoBuffering
repLoop
repLoop :: IO ()
repLoop
= do putStr "> "
line <- interruptible "<interrupted>" getLine
if line == "exit"
then putStrLn "goodbye"
else do putStrLn $ "input was: " ++ line
repLoop
interruptible :: a -> IO a -> IO a
interruptible a m
= E.handleJust f return m
where
f UserInterrupt
= Just a
f _
= Nothing

Wei Hu is correct; the Haskell runtime system deliberately aborts the program when a second control-C is pressed. To get the behavior one might expect:
import Control.Exception as E
import Control.Concurrent
import System.Posix.Signals
main = do
tid <- myThreadId
installHandler keyboardSignal (Catch (throwTo tid UserInterrupt)) Nothing
... -- rest of program

Disclaimer: I'm not familiar with GHC internals and my answer is based on grepping the source code, reading the comments, and making guesses.
The main function you define is in fact wrapped by runMainIO defined in GHC.TopHandler (this is further confirmed by looking at TcRnDriver.lhs):
-- | 'runMainIO' is wrapped around 'Main.main' (or whatever main is
-- called in the program). It catches otherwise uncaught exceptions,
-- and also flushes stdout\/stderr before exiting.
runMainIO :: IO a -> IO a
runMainIO main =
do
main_thread_id <- myThreadId
weak_tid <- mkWeakThreadId main_thread_id
install_interrupt_handler $ do
m <- deRefWeak weak_tid
case m of
Nothing -> return ()
Just tid -> throwTo tid (toException UserInterrupt)
a <- main
cleanUp
return a
`catch`
topHandler
And install_interrupt_handler is defined as:
install_interrupt_handler :: IO () -> IO ()
#ifdef mingw32_HOST_OS
install_interrupt_handler handler = do
_ <- GHC.ConsoleHandler.installHandler $
Catch $ \event ->
case event of
ControlC -> handler
Break -> handler
Close -> handler
_ -> return ()
return ()
#else
#include "rts/Signals.h"
-- specialised version of System.Posix.Signals.installHandler, which
-- isn't available here.
install_interrupt_handler handler = do
let sig = CONST_SIGINT :: CInt
_ <- setHandler sig (Just (const handler, toDyn handler))
_ <- stg_sig_install sig STG_SIG_RST nullPtr
-- STG_SIG_RST: the second ^C kills us for real, just in case the
-- RTS or program is unresponsive.
return ()
On Linux, stg_sig_install is a C function that calls out to sigaction. The parameter STG_SIG_RST is translated to SA_RESETHAND. On Windows, things are done differently, which probably explains ja's observation.

The most reliable solution for me (at least on Linux), has been to install a signal handler using System.Posix.Signals. I was hoping for a solution that would not require this, but the real reason I posted the question was that I wanted to know why GHC behaved the way it did. As explained on #haskell, a likely explanation is that GHC behaves this way so that the user can always Control-C an application if it hangs. Still, it would be nice if GHC provided a way to affect this behavior without the somewhat lower-level method that we resorted to :).

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Haskell: getProcessStatus blocking SIGINT - haskell

Related

"thread blocked indefinitely in an STM transaction" in a case where threads are never blocked

How to link parent async with multiple children asyncs

In reactive-banana, is it safe to trigger handler actions from multiple threads?

How does one atomically use a set of handles?

Catching Control-C exception in GHC (Haskell)

Categories

Resources