How to link parent async with multiple children asyncs - multithreading

The documentation of the async package describes the withAsync function as:
Spawn an asynchronous action in a separate thread, and pass its Async
handle to the supplied function. When the function returns or throws
an exception, uninterruptibleCancel is called on the Async. This is a
useful variant of async that ensures an Async is never left running
unintentionally.
I've been staring at that for the past two hours and haven't been able to figure out how to start a monitor thread that spawns multiple worker threads, such that:
If the monitor thread dies, all worker threads should be killed,
But, if any worker thread dies, none of the other worker threads should be affected. The monitor should be notified and it should be able to restart the worker thread.

It seems that we need two functions: one that starts all the async tasks and another that watches them and restarts them whenever they die.
The first one could be written like this:
withAsyncMany :: [IO t] -> ([Async t] -> IO b) -> IO b
withAsyncMany [] f = f []
withAsyncMany (t:ts) f = withAsync t $ \a -> withAsyncMany ts (f . (a:))
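For example, here is a minimal usage sketch (the demo and its worker tasks are invented for illustration): the continuation receives one Async per action, and every Async is cancelled when the continuation returns.
demo :: IO ()
demo = withAsyncMany [task "A", task "B"] (mapM_ wait)
  where
    -- each worker sleeps for a second, then reports in
    task name = do
        threadDelay 1000000
        putStrLn ("worker " ++ name ++ " done")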
If we were using the managed package, we could also write it like this:
import Control.Monad.Managed (with, managed)
withAsyncMany' :: [IO t] -> ([Async t] -> IO b) -> IO b
withAsyncMany' = with . traverse (\t -> managed (withAsync t))
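Here managed (withAsync t) wraps each withAsync bracket as a Managed value, and traverse sequences those brackets, so the nesting that withAsyncMany builds by explicit recursion falls out automatically.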
The restart function would loop over the list of asyncs, polling their status and renewing them when they fail:
{-# language NumDecimals #-}
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (Async, poll, withAsync)

resurrect :: IO t -> [Async t] -> IO ()
resurrect restartAction = go []
    where
    go past [] = do
        threadDelay 1e6 -- wait a little before the next round of polling
        go [] (reverse past)
    go past (a:pending) = do
        status <- poll a -- has the task died, or finished?
        case status of
            Nothing -> go (a:past) pending
            Just _  -> withAsync restartAction $ \a' -> go (a':past) pending
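The two pieces compose directly. A sketch of the combined monitor, assuming every worker is restarted with the same action:
supervise :: IO t -> [IO t] -> IO ()
supervise restartAction tasks =
    withAsyncMany tasks (resurrect restartAction)
If the thread running supervise is killed, each withAsync layer cancels its worker on the way out, which gives the first property asked for.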
I'm worried, however, about the possibility of many nested withAsyncs causing some type of resource leak (because some kind of exception handler must be installed with each withAsync to notify the child in case the parent thread dies).
So perhaps in this case it would be better to spawn workers with plain asyncs, store the collection of Asyncs into some kind of mutable reference and install a single exception handler in the monitor thread, which would traverse the container terminating each task.

Here's another answer that uses async instead of withAsync. The main function is:
{-# language NumDecimals #-}
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async
import Control.Exception (bracket, mask_)
import Control.Monad (forever, replicateM)
import Data.IORef

monitor :: Int -> IO () -> IO ()
monitor count task =
    bracket (do asyncs <- replicateM count (async task)
                newIORef asyncs)
            (\ref -> do
                asyncs <- readIORef ref
                mapM_ uninterruptibleCancel asyncs)
            (\ref -> forever (do
                threadDelay 1e6
                asyncs <- readIORef ref
                vivify task (writeIORef ref) asyncs))
It uses an auxiliary vivify function that traverses a list of Asyncs, reviving dead ones and writing the updated list back to an IORef:
vivify :: IO () -> ([Async ()] -> IO ()) -> [Async ()] -> IO ()
vivify task write asyncs = go [] asyncs
    where
    go _ [] =
        return ()
    go past (a:pending) = do
        status <- poll a
        case status of
            Nothing ->
                go (a:past) pending
            Just _ -> do
                past' <- mask_ $ do
                    a' <- async task
                    write (reverse (a':past) ++ pending)
                    return (a':past)
                go past' pending
We mask asynchronous exceptions in the interval between creating a new Async and "persisting" it in the IORef, because otherwise, if an asynchronous exception arrived in between and killed the monitor thread, that Async would remain dangling.
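A quick way to exercise monitor (the worker body is invented for the demo): each worker announces itself, dies after half a second, and keeps getting revived; interrupting the program with Ctrl-C triggers the bracket cleanup that cancels the survivors.
main :: IO ()
main = monitor 5 (do threadDelay 500000
                     putStrLn "worker dying"
                     error "crashed")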

Waiting on multiple async's reliably?

My code needs to fire multiple threads and keep track of which have finished and which are still running. I was planning on using waitAny or waitAnyCatch, but was thrown off by the following in the documentation:
If multiple Asyncs complete or have completed, then the value returned corresponds to the first completed Async in the list.
If that is really the case, how does one ever keep track of running / exited threads reliably?
Here's my simplified code:
chan <- newChan
currentThreadsRef <- newIORef []

-- read jobs from a channel, and run them in parallel asyncs/threads,
-- while adding all thread references to currentThreadsRef
async $ do
    jobArgs <- readChan chan
    jobAsync <- async $ runJob jobArgs
    atomicModifyIORef' currentThreadsRef $ \x -> (jobAsync:x, ())

-- wait for jobs to be finished, and remove the thread reference
-- from currentThreadsRef
waitForAllJobs currentJobsRef =
    readIORef currentJobsRef >>= \case
        [] -> logDebug "All jobs exited"
        currentJobs -> do
            (exitedJob, jobResult) <- waitAnyCatch currentJobs
            atomicModifyIORef' currentJobsRef $ \x -> (filter (/= exitedJob) x, ())
            logDebug $ "Job completed with result=" <> show jobResult
            waitForAllJobs currentJobsRef
PS: Although it may not be obvious from my simplified code above, there is a reason why I cannot simply use mapConcurrently over the input-data. Actually, async-pool seems like a good fit for my use-case, but even that has the same problem with waitAny.
Here's a program that launches 1000 asyncs all set to terminate within a second and waits for them all in a loop. Compiled with ghc -O2 -threaded and run with +RTS -N, it runs in about 1.5 seconds, and none of the asyncs gets "lost":
import Control.Concurrent
import Control.Concurrent.Async
import qualified Data.Set as Set

main :: IO ()
main = do
    let n = 1000 :: Int
    asyncs0 <- mapM (\i -> async (threadDelay 1000000 >> return i)) [1..n]
    let loop :: Set.Set (Async Int) -> IO ()
        loop asyncs
            | null asyncs = return ()
            | otherwise = do
                (a, _i) <- waitAny (Set.toList asyncs)
                loop (Set.delete a asyncs)
    loop (Set.fromList asyncs0)
So, as was mentioned in a comment, the documentation is referring to the fact that the first completed async in the provided list is the one that will be "returned", but if multiple asyncs have completed, the additional ones aren't "forgotten". You just need to remove the returned async from the list and re-poll, and you'll eventually get them all.
So, you shouldn't have any trouble waiting on multiple asyncs with waitAny.

"thread blocked indefinitely in an STM transaction" in a case where threads are never blocked

I'm using the async library in conjunction with stm in my program.
The main thread forks two threads which run until one of them (it could be either one) encounters a solution. The solution is returned via a TMVar. Neither of them ever waits on any TMVar except to call putTMVar when the solution is found, and one of them is guaranteed to run forever unless killed. So how could I possibly be getting "thread blocked indefinitely in an STM transaction" (which seems to happen approximately one in every twenty times), given that at least one of the child threads doesn't execute any blocking STM transactions (or die) until storing a result?
Note that the two child threads communicate somewhat with each other using TVars, but not via TMVars.
Simplified code:
main :: IO ()
main = do
    output <- newEmptyTMVarIO
    result <- withAsync (child1 output) $ \_ ->
              withAsync (child2 output) $ \_ ->
                  let go = do
                          result <- atomically $ takeTMVar output
                          if someCondition result
                              then return result
                              else go
                  in go
    print result

child1 :: TMVar Result -> IO ()
child1 output = go 0
    where
    go i = do
        case computation1 i of
            Nothing -> return ()
            Just x -> atomically $ putTMVar output x
        go (i + 1)

child2 :: TMVar Result -> IO ()
-- Does some other stuff, but also only interacts with its argument to
-- give back a result, same as child1.

Why does my parallel traversal Haskell program leak memory?

Consider the following Haskell program (I'm doing this mostly for learning purposes):
import qualified Control.Concurrent.MSem as Sem
import System.Environment (getArgs)
import Control.Concurrent (forkIO)
import Control.Monad

-- Traverse with maximum n threads
parallelTraverse :: Foldable a => Int -> (b -> IO ()) -> a b -> IO ()
parallelTraverse n action values = do
    sem <- Sem.new n
    forM_ values $ \value -> Sem.with sem (forkIO $ action value)

main :: IO ()
main = do
    args <- getArgs
    let nThreads = read . head $ args :: Int
    parallelTraverse nThreads print [(1::Int)..]
When I run it, memory quickly climbs to several GB. I tried various combinations to make sure I discard the results of intermediate computations (the print actions). Why is it still leaking space?
First of all, you have an evident mistake in the following piece:
Sem.with sem (forkIO $ action value)
You're addressing the semaphore from the master thread, around the "fork" operation, instead of around the action itself. The proper way to implement it is:
forkIO (Sem.with sem (action value))
I.e., to address the semaphore from the context of the forked thread.
Secondly, in the following code you're calling the parallelTraverse operation on an infinite list:
parallelTraverse nThreads print [(1::Int)..]
Which results in the infinite forking of threads. And since the forkIO operation is roughly instantaneous for the calling thread, it's pretty much no surprise that you're running out of resources quite soon.
To use the semaphore to limit the number of worker threads, the with pattern simply won't do in your case. Instead you should use an explicit combination of wait and signal, and not forget to handle exceptions properly (in case you expect them). E.g.:
import Control.Exception (finally)

parallelTraverse :: Foldable a => Int -> (b -> IO ()) -> a b -> IO ()
parallelTraverse n action values = do
    sem <- Sem.new n
    forM_ values $ \value -> do
        Sem.wait sem
        forkIO $ finally (action value) (Sem.signal sem)
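With that fix, the main function from the question works as expected; here it is adapted to a bounded input, so at most n print actions are ever in flight at a time:
main :: IO ()
main = do
    args <- getArgs
    let nThreads = read . head $ args :: Int
    parallelTraverse nThreads print [(1 :: Int)..100000]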

Storing arbitrary function calls across threads

I'm trying to write a library aiming to reproduce Qt's threading semantics: signals can be connected to slots, and all slots execute in a known thread, so that slots tied to the same thread are thread-safe with regard to each other.
I have the following API:
data Signal a = Signal Unique a
data Slot a = Slot Unique ThreadId (a -> IO ())
mkSignal :: IO (Signal a)
mkSlot :: ThreadId -> (Slot a -> a -> IO ()) -> IO (Slot a)
connect :: Signal a -> Slot a -> IO ()
-- callable from any thread
emit :: Signal a -> a -> IO ()
-- runs in Slot's thread as a result of `emit`
execute :: Slot a -> a -> IO ()
execute (Slot _ _ f) arg = f arg
The problem is getting from emit to execute. The argument needs to be stored at runtime somehow, and then an IO action performed, but I can't seem to get past the type checker.
The things I need:
Type safety: signals shouldn't be connected to slots expecting a different type.
Type-independence: there can be more than one slot for any given type (perhaps this can be relaxed with newtype and/or TH).
Ease of use: since this is a library, signals and slots should be easy to create.
The things I've tried:
Data.Dynamic: makes the whole thing really fragile, and I haven't found a way to perform a correctly-typed IO action on a Dynamic. There's dynApply, but it's pure.
Existential types: I need to execute the function passed to mkSlot, as opposed to an arbitrary function based on the type.
Data.HList: I'm not smart enough to figure it out.
What am I missing?
Firstly, are you sure Slots really want to execute in a specific thread? It's easy to write thread-safe code in Haskell, and threads are very lightweight in GHC, so you're not gaining much by tying all event-handler execution to a specific Haskell thread.
Also, mkSlot's callback doesn't need to be given the Slot itself: you can use recursive do-notation to bind the slot in its callback without adding the concern of tying the knot to mkSlot.
Anyway, you don't need anything as complicated as those solutions. I expect when you talk about existential types, you're thinking about sending something like (a -> IO (), a) through a TChan (which you mentioned using in the comments) and applying it on the other end, but you want the TChan to accept values of this type for any a, rather than just one specific a. The key insight here is that if you have (a -> IO (), a) and don't know what a is, the only thing you can do is apply the function to the value, giving you an IO () — so we can just send those through the channel instead!
Here's an example:
import Data.Unique
import Control.Applicative
import Control.Monad
import Control.Concurrent
import Control.Concurrent.STM

newtype SlotGroup = SlotGroup (IO () -> IO ())

data Signal a = Signal Unique (TVar [Slot a])
data Slot a = Slot Unique SlotGroup (a -> IO ())

-- When executed, this produces a function taking an IO action and returning
-- an IO action that writes that action to the internal TChan. The advantage
-- of this approach is that it's impossible for clients of newSlotGroup to
-- misuse the internals by reading the TChan or similar, and the interface is
-- kept abstract.
newSlotGroup :: IO SlotGroup
newSlotGroup = do
    chan <- newTChanIO
    _ <- forkIO . forever . join . atomically . readTChan $ chan
    return $ SlotGroup (atomically . writeTChan chan)

mkSignal :: IO (Signal a)
mkSignal = Signal <$> newUnique <*> newTVarIO []

mkSlot :: SlotGroup -> (a -> IO ()) -> IO (Slot a)
mkSlot group f = Slot <$> newUnique <*> pure group <*> pure f

connect :: Signal a -> Slot a -> IO ()
connect (Signal _ v) slot = atomically $ do
    slots <- readTVar v
    writeTVar v (slot:slots)

emit :: Signal a -> a -> IO ()
emit (Signal _ v) a = atomically (readTVar v) >>= mapM_ (`execute` a)

execute :: Slot a -> a -> IO ()
execute (Slot _ (SlotGroup send) f) a = send (f a)
This uses a TChan to send actions to the worker thread each slot is tied to.
Note that I'm not very familiar with Qt, so I may have missed some subtlety of the model. You can also disconnect Slots with this:
disconnect :: Signal a -> Slot a -> IO ()
disconnect (Signal _ v) (Slot u _ _) = atomically $ do
    slots <- readTVar v
    writeTVar v $ filter keep slots
  where
    keep (Slot u' _ _) = u' /= u
You might want something like Map Unique (Slot a) instead of [Slot a] if this is likely to be a bottleneck.
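A sketch of that variant (the primed names are mine, to avoid clashing with the list-based definitions above); keying slots by their Unique makes disconnect a logarithmic Map.delete:
import qualified Data.Map as Map

data Signal' a = Signal' Unique (TVar (Map.Map Unique (Slot a)))

connect' :: Signal' a -> Slot a -> IO ()
connect' (Signal' _ v) slot@(Slot u _ _) =
    atomically $ modifyTVar' v (Map.insert u slot)

disconnect' :: Signal' a -> Slot a -> IO ()
disconnect' (Signal' _ v) (Slot u _ _) =
    atomically $ modifyTVar' v (Map.delete u)

emit' :: Signal' a -> a -> IO ()
emit' (Signal' _ v) a =
    atomically (readTVar v) >>= mapM_ (`execute` a) . Map.elems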
So, the solution here is to (a) recognise that you have something that's fundamentally based upon mutable state, and use a mutable variable to structure it; (b) realise that functions and IO actions are first-class just like everything else, so you don't have to do anything special to construct them at runtime :)
By the way, I suggest keeping the implementations of Signal and Slot abstract by not exporting their constructors from the module defining them; there are many ways to tackle this approach without changing the API, after all.

Reentrant caching of "referentially transparent" IO calls

Assume we have an IO action such as
lookupStuff :: InputType -> IO OutputType
which could be something simple such as a DNS lookup, or some web-service call against time-invariant data.
Let's assume that:
The operation never throws any exception and/or never diverges
If it wasn't for the IO monad, the function would be pure, i.e. the result is always the same for equal input parameters
The action is reentrant, i.e. it can be called from multiple threads at the same time safely.
The lookupStuff operation is quite (time-)expensive.
The problem I'm facing is how to properly (and without using any unsafe*IO* cheat) implement a reentrant cache that can be called from multiple threads and coalesces multiple queries for the same input parameters into a single request.
I guess I'm after something similar to GHC's blackhole concept for pure computations, but in the IO "calculation" context.
What is the idiomatic Haskell/GHC solution for the stated problem?
Yeah, basically reimplement the logic. Although it seems similar to what GHC is already doing, that's GHC's choice. Haskell can be implemented on VMs that work very differently, so in that sense it isn't already done for you.
But yeah, just use an MVar (Map InputType OutputType) or even an IORef (Map InputType OutputType) (make sure to modify it with atomicModifyIORef), and just store the cache in there. If this imperative solution seems wrong, blame it on the "if not for the IO, this function would be pure" constraint. If it were just an arbitrary IO action, then the idea that you would have to keep state in order to know what to execute or not would seem perfectly natural. The problem is that Haskell does not have a type for "pure IO": an action that depends on a database behaves purely only under certain conditions, which is not the same as being hereditarily pure.
import qualified Data.Map as Map
import Control.Concurrent.MVar

-- takes an IO function and returns a cached version
cache :: (Ord a) => (a -> IO b) -> IO (a -> IO b)
cache f = do
    r <- newMVar Map.empty
    return $ \x -> do
        cacheMap <- takeMVar r
        case Map.lookup x cacheMap of
            Just y -> do
                putMVar r cacheMap
                return y
            Nothing -> do
                y <- f x
                putMVar r (Map.insert x y cacheMap)
                return y
Yeah, it's ugly on the inside. But on the outside, look at that! It's just like the type of a pure memoization function, except it has IO stained all over it.
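Usage looks just like memoization, too. A toy sketch (the slow "computation" here is a stand-in for DNS lookups and the like):
main :: IO ()
main = do
    slowSquare <- cache (\x -> do
        putStrLn ("computing " ++ show x)
        return (x * x :: Int))
    print =<< slowSquare 3 -- prints "computing 3", then 9
    print =<< slowSquare 3 -- cache hit: prints only 9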
Here's some code implementing more or less what I was after in my original question:
import Control.Concurrent
import Control.Exception
import Data.Either
import Data.Map (Map)
import qualified Data.Map as Map
import Prelude hiding (catch)

-- |Memoizing wrapper for 'IO' actions
memoizeIO :: Ord a => (a -> IO b) -> IO (a -> IO b)
memoizeIO action = do
    cache <- newMVar Map.empty
    return $ memolup cache action
  where
    -- Lookup helper
    memolup :: Ord a => MVar (Map a (Async b)) -> (a -> IO b) -> a -> IO b
    memolup cache action' args = wait' =<< modifyMVar cache lup
      where
        lup tab = case Map.lookup args tab of
            Just ares' ->
                return (tab, ares')
            Nothing -> do
                ares' <- async $ action' args
                return (Map.insert args ares' tab, ares')
The code above builds upon Simon Marlow's Async abstraction as described in Tutorial: Parallel and Concurrent Programming in Haskell:
-- |Opaque type representing asynchronous results.
data Async a = Async ThreadId (MVar (Either SomeException a))

-- |Construct 'Async' result. Can be waited on with 'wait'.
async :: IO a -> IO (Async a)
async io = do
    var <- newEmptyMVar
    tid <- forkIO ((do r <- io; putMVar var (Right r))
                   `catch` \e -> putMVar var (Left e))
    return $ Async tid var

-- |Extract value from asynchronous result. May block if result is not
-- available yet. Exceptions are returned as 'Left' values.
wait :: Async a -> IO (Either SomeException a)
wait (Async _ m) = readMVar m

-- |Version of 'wait' that raises exception.
wait' :: Async a -> IO a
wait' a = either throw return =<< wait a

-- |Cancels asynchronous computation if not yet completed (non-blocking).
cancel :: Async a -> IO ()
cancel (Async t _) = throwTo t ThreadKilled
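To see the request coalescing in action, here is a small demo (invented for illustration) where two concurrent callers with the same argument share a single in-flight computation:
main :: IO ()
main = do
    slow <- memoizeIO (\x -> do threadDelay 1000000
                                return (x * 2 :: Int))
    -- both callers look up the same key, so only one Async is created
    a <- async (slow 21)
    b <- async (slow 21)
    rs <- mapM wait' [a, b]
    print rs -- [42,42], after about one second rather than two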
