Reentrant caching of "referentially transparent" IO calls - haskell

Assume we have an IO action such as
lookupStuff :: InputType -> IO OutputType
which could be something simple such as a DNS lookup, or a web-service call against time-invariant data.
Let's assume that:
The operation never throws an exception and never diverges
If it wasn't for the IO monad, the function would be pure, i.e. the result is always the same for equal input parameters
The action is reentrant, i.e. it can be called from multiple threads at the same time safely.
The lookupStuff operation is quite (time-)expensive.
The problem I'm facing is how to properly (and w/o using any unsafe*IO* cheat) implement a reentrant cache, that can be called from multiple threads, and coalesces multiple queries for the same input-parameters into a single request.
I guess I'm after something similar to GHC's blackhole concept for pure computations, but in the IO "calculation" context.
What is the idiomatic Haskell/GHC solution for the stated problem?

Yeah, basically reimplement the logic. Although it seems similar to what GHC is already doing, that's GHC's choice. Haskell can be implemented on VMs that work very differently, so in that sense it isn't already done for you.
But yeah, just use an MVar (Map InputType OutputType), or even an IORef (Map InputType OutputType) (making sure to modify it with atomicModifyIORef), and store the cache in there. If this imperative solution seems wrong, it's because of the "if not for the IO, this function would be pure" constraint. If it were just an arbitrary IO action, then the idea that you have to keep state in order to decide what to execute would seem perfectly natural. The problem is that Haskell has no type for "pure IO" (an action that, say, depends on a database is only behaving purely under certain conditions, which is not the same as being hereditarily pure).
import qualified Data.Map as Map
import Control.Concurrent.MVar

-- takes an IO function and returns a cached version
cache :: (Ord a) => (a -> IO b) -> IO (a -> IO b)
cache f = do
    r <- newMVar Map.empty
    return $ \x -> do
        cacheMap <- takeMVar r
        case Map.lookup x cacheMap of
            Just y -> do
                putMVar r cacheMap
                return y
            Nothing -> do
                y <- f x
                putMVar r (Map.insert x y cacheMap)
                return y
Yeah, it's ugly on the inside. But on the outside, look at that! It's just like the type of a pure memoization function, except that it has IO stained all over it.
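For illustration, here's a minimal usage sketch (slowLookup is just a hypothetical stand-in for the expensive call). Note that this version holds the MVar while f runs, so concurrent callers are serialized rather than merely coalesced:

main :: IO ()
main = do
    cachedLookup <- cache slowLookup
    print =<< cachedLookup 21   -- runs slowLookup: prints the trace line, then 42
    print =<< cachedLookup 21   -- cache hit: prints 42 only
  where
    -- hypothetical stand-in for the expensive, referentially transparent call
    slowLookup :: Int -> IO Int
    slowLookup x = do
        putStrLn ("looking up " ++ show x)
        return (x * 2)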

Here's some code implementing more or less what I was after in my original question:
import Control.Concurrent
import Control.Exception
import Data.Either
import Data.Map (Map)
import qualified Data.Map as Map
import Prelude hiding (catch)
-- |Memoizing wrapper for 'IO' actions
memoizeIO :: Ord a => (a -> IO b) -> IO (a -> IO b)
memoizeIO action = do
    cache <- newMVar Map.empty
    return $ memolup cache action
  where
    -- Lookup helper
    memolup :: Ord a => MVar (Map a (Async b)) -> (a -> IO b) -> a -> IO b
    memolup cache action' args = wait' =<< modifyMVar cache lup
      where
        lup tab = case Map.lookup args tab of
            Just ares' ->
                return (tab, ares')
            Nothing -> do
                ares' <- async $ action' args
                return (Map.insert args ares' tab, ares')
The code above builds upon Simon Marlow's Async abstraction as described in Tutorial: Parallel and Concurrent Programming in Haskell:
-- |Opaque type representing asynchronous results.
data Async a = Async ThreadId (MVar (Either SomeException a))

-- |Construct 'Async' result. Can be waited on with 'wait'.
async :: IO a -> IO (Async a)
async io = do
    var <- newEmptyMVar
    tid <- forkIO ((do r <- io; putMVar var (Right r))
                   `catch` \e -> putMVar var (Left e))
    return $ Async tid var

-- |Extract value from asynchronous result. May block if result is not
-- available yet. Exceptions are returned as 'Left' values.
wait :: Async a -> IO (Either SomeException a)
wait (Async _ m) = readMVar m

-- |Version of 'wait' that raises exception.
wait' :: Async a -> IO a
wait' a = either throw return =<< wait a

-- |Cancels asynchronous computation if not yet completed (non-blocking).
cancel :: Async a -> IO ()
cancel (Async t _) = throwTo t ThreadKilled
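For illustration, a small usage sketch (slowLookup is a hypothetical stand-in): two concurrent calls for the same key find the same Async in the cache, so the expensive action runs only once and both callers wait on its result.

demo :: IO ()
demo = do
    cached <- memoizeIO slowLookup
    a1 <- async (cached 21)
    a2 <- async (cached 21)
    print =<< wait' a1
    print =<< wait' a2
  where
    -- hypothetical stand-in for the expensive lookup
    slowLookup :: Int -> IO Int
    slowLookup x = do
        putStrLn ("looking up " ++ show x)   -- printed only once for the two calls
        threadDelay 1000000
        return (x * 2)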

Related

Convert IO callback to infinite list

I am using a library that I can provide with a function a -> IO (), which it will call occasionally.
Because the output of my function depends not only on the a it receives as input, but also on the previous a's, it would be much easier for me to write a function [a] -> IO (), where [a] is infinite.
Can I write a function:
magical :: ([a] -> IO ()) -> (a -> IO ())
That collects the a's it receives from the callback and passes them to my function as a lazy infinite list?
The IORef solution is indeed the simplest one. If you'd like to explore a pure (but more complex) variant, have a look at conduit. There are other implementations of the same concept (see Iteratee I/O), but I found conduit very easy to use.
A conduit (a.k.a. pipe) is an abstraction of a program that can accept input and/or produce output. As such, it can keep internal state if needed. In your case, magical would be a sink, that is, a conduit that accepts input of some type but produces no output. By wiring it into a source, a program that produces output, you complete the pipeline, and then every time the sink asks for an input, the source is run until it produces its output.
In your case you'd have roughly something like
magical :: Sink a IO ()   -- consumes a stream of `a`s, no result
magical = go (some initial state)
  where
    go state = do
        m'input <- await
        case m'input of
            Nothing    -> return ()  -- finish
            Just input -> do
                -- do something with the input
                go (some updated state)
This is not exactly what you asked for, but it might be enough for your purposes, I think.
magical :: ([a] -> IO ()) -> IO (a -> IO ())
magical f = do
    list <- newIORef []
    let g x = do
            modifyIORef list (x:)
            xs <- readIORef list
            f xs  -- or (reverse xs), if you need FIFO ordering
    return g
So if you have a function fooHistory :: [a] -> IO (), you can use
main = do
    ...
    foo <- magical fooHistory
    setHandler foo  -- here we have foo :: a -> IO ()
    ...
As #danidaz wrote above, you probably do not need magical, but can play the same trick directly in your fooHistory, modifying a list reference (IORef [a]).
main = do
    ...
    list <- newIORef []
    let fooHistory x = do
            modifyIORef list (x:)
            xs <- readIORef list
            use xs  -- or (reverse xs), if you need FIFO ordering
    setHandler fooHistory  -- here we have fooHistory :: a -> IO ()
    ...
Control.Concurrent.Chan does almost exactly what I wanted!
import Control.Monad (forever, void)
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan

setHandler :: (Char -> IO ()) -> IO ()
setHandler f = void . forkIO . forever $ getChar >>= f

process :: String -> IO ()
process ('h':'i':xs) = putStrLn "hi" >> process xs
process ('a':xs)     = putStrLn "a"  >> process xs
process (x:xs)       = process xs
process _            = error "Guaranteed to be infinite"

main :: IO ()
main = do
    c <- newChan
    setHandler $ writeChan c
    list <- getChanContents c
    process list
This seems like a flaw in the library design to me. You might consider an upstream patch so that you could provide something more versatile as input.

Write interval function for the StateT monad in Haste

So I wrote my own implementation of StateT because I couldn't get transformers to compile correctly in Haste. I wanted to get the JavaScript setInterval working inside my state monad. Here is the FFI call to setInterval.
jsInterval :: Int -> IO () -> IO Int
jsInterval = ffi "(function(t,f){window.setInterval(f,t);})"
I couldn't think of any way to get the result of m back after it is passed to jsInterval, so I tried to use IORefs.
interval :: Int -> StateT s IO () -> StateT s IO Int
interval i m = StateT $ \s -> do
    ref <- newIORef Nothing
    id_ <- jsInterval i $ do
        (_, s') <- runStateT m s
        writeIORef ref (Just s')
    s' <- readIORef ref
    return (id_, s')
This didn't work because it kept the original state. The read happened before the write. So I wrote a function that would poll in a loop until the IORef was written but this just hung forever.
interval :: Int -> StateT s IO () -> StateT s IO Int
interval i m = StateT $ \s -> do
    ref <- newIORef Nothing
    id_ <- jsInterval i $ do
        (_, s') <- runStateT m s
        writeIORef ref (Just s')
    s' <- go ref
    return (id_, s')
  where
    go ref = do
        s <- readIORef ref
        case s of
            Nothing -> go ref
            Just s' -> return s'
Is it possible to implement this function? I tried writing an instance of MonadEvent for StateT but that was also unsuccessful.
The IO action you are passing to your FFI'ed jsInterval is just a plain IO action. If you implement that action using runStateT you are just running a little 'local' StateT. It's unrelated to the enclosing code.
This is a generic problem with callbacks and monad stacks - callbacks (in the sense that the IO() parameter to jsInterval is a callback) have a fixed monad chosen in their definition and they have no way to generalise to other monadic effects you might be using elsewhere.
Since callbacks - in general - can be called at any time, including multiple times at once, in different threads, or after the calling function has completed and its state has been destroyed, you can see that this is a hard problem to solve in general.
The pragmatic answer is, as you have tried, to just use an IORef: create the IORef in the enclosing action and let the callback modify it. You can still write the callback in StateT style if you wish - just extract the state from the IORef and pass it to runStateT. Your code doesn't do this, though; you are just referencing the parameter s from the top level. You need to use the IORef, something like this:
id_ <- jsInterval i $ do
    current_s <- readIORef ref
    (_, new_s) <- runStateT m current_s
    writeIORef ref new_s
You can't really use Maybe unless you are prepared to teach the action m how to cope with a Maybe - it needs to deal with the Nothing case, so perhaps you want it to have the type StateT (Maybe s) IO ()?
A second logic problem with your code is that the s returned by interval certainly will not have been changed yet - the setInterval code can't possibly have been triggered until JavaScript goes back into its idle loop.
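Putting those pieces together, a hedged sketch of interval along these lines (untested with Haste, and assuming Data.IORef plus your own StateT are in scope) might look like this:

interval :: Int -> StateT s IO () -> StateT s IO Int
interval i m = StateT $ \s -> do
    -- Seed the IORef with the incoming state; each tick reads and updates it.
    ref <- newIORef s
    id_ <- jsInterval i $ do
        current_s <- readIORef ref
        (_, new_s) <- runStateT m current_s
        writeIORef ref new_s
    -- No tick can have fired yet, so the state we return is the unmodified s.
    return (id_, s)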
The general problem of passing callbacks has been discussed a few times over the years, see:
https://mail.haskell.org/pipermail/haskell-cafe/2007-July/028501.html
http://andersk.mit.edu/haskell/monad-peel/
http://blog.sigfpe.com/2011/10/quick-and-dirty-reinversion-of-control.html
etc.

Abstraction for monadic recursion with "unless"

I'm trying to work out if it's possible to write an abstraction for the following situation. Suppose I have a type a with function a -> m Bool e.g. MVar Bool and readMVar. To abstract this concept out I create a newtype wrapper for the type and its function:
newtype MPredicate m a = MPredicate (a,a -> m Bool)
I can define a fairly simple operation like so:
doUnless :: (Monad m) => MPredicate m a -> m () -> m ()
doUnless (MPredicate (a, mg)) g = mg a >>= \b -> unless b g

main = do
    b <- newMVar False
    let mpred = MPredicate (b, readMVar)
    doUnless mpred (print "foo")
In this case doUnless would print "foo". Aside: I'm not sure whether a type class might be more appropriate to use instead of a newtype.
Now take the code below, which outputs an incrementing number then waits a second and repeats. It does this until it receives a "turn off" instruction via the MVar.
foobar :: MVar Bool -> IO ()
foobar mvb = foobar' 0
  where
    foobar' :: Int -> IO ()
    foobar' x = readMVar mvb >>= \b -> unless b $ do
        let x' = x + 1
        print x'
        threadDelay 1000000
        foobar' x'

goTillEnter :: MVar Bool -> IO ()
goTillEnter mv = do
    _ <- getLine
    _ <- takeMVar mv
    putMVar mv True

main = do
    mvb <- newMVar False
    forkIO $ foobar mvb
    goTillEnter mvb
Is it possible to refactor foobar so that it uses MPredicate and doUnless?
Ignoring the actual implementation of foobar' I can think of a simplistic way of doing something similar:
cycleUnless :: x -> (x -> x) -> MPredicate m a -> m ()
cycleUnless x g mp = let g' x' = doUnless mp (g' $ g x')
                     in  g' $ g x
Aside: I feel like fix could be used to make the above neater, though I still have trouble working out how to use it
But cycleUnless won't work on foobar because the type of foobar' is actually Int -> IO () (from the use of print x').
I'd also like to take this abstraction further, so that it can work threading around a Monad. With stateful Monads it becomes even harder. E.g.
-- EDIT: Updated the below to show an example of how the code is used
{- ^^ some parent function which has the MVar ^^ -}
cycleST :: (forall s. ST s (STArray s Int Int)) -> IO ()
cycleST sta = readMVar mvb >>= \b -> unless b $ do
    n <- readMVar someMVar
    i <- readMVar someOtherMVar
    let sta' = do
            arr <- sta
            x <- readArray arr n
            writeArray arr n (x + i)
            return arr
        y = runSTArray sta'
    print y
    cycleST sta'
I have something similar to the above working with RankNTypes. Now there's the additional problem of trying to thread through the existential s, which is not likely to type check if threaded around through an abstraction the likes of cycleUnless.
Additionally, this is simplified to make the question easier to answer. I also use a set of semaphores built from MVar [MVar ()] similar to the skip channel example in the MVar module. If I can solve the above problem I plan to generalize the semaphores as well.
Ultimately this isn't some blocking problem. I have 3 components of the application operating in a cycle off the same MVar Bool but doing fairly different asynchronous tasks. In each one I have written a custom function that performs the appropriate cycle.
I'm trying to learn the "don't write large programs" approach. What I'd like to do is refactor chunks of code into their own mini libraries so that I'm not building a large program but assembling lots of small ones. But so far this particular abstraction is escaping me.
Any thoughts on how I might go about this are very much appreciated!
You want to cleanly combine a stateful action having side effects, a delay, and an independent stopping condition.
The iterative monad transformer from the free package can be useful in these cases.
This monad transformer lets you describe a (possibly non-ending) computation as a series of discrete steps. And what's better, it lets you interleave "stepped" computations using mplus. The combined computation stops when any of the individual computations stops.
Some preliminary imports:
import Data.Bool
import Control.Monad
import Control.Monad.Trans
import Control.Monad.Trans.Iter (delay,untilJust,IterT,retract,cutoff)
import Control.Concurrent
Your foobar function could be understood as a "sum" of three things:
A computation that does nothing but read from the MVar at each step, and finishes when the MVar is True.
untilTrue :: (MonadIO m) => MVar Bool -> IterT m ()
untilTrue = untilJust . liftM guard . liftIO . readMVar
An infinite computation that takes a delay at each step.
delays :: (MonadIO m) => Int -> IterT m a
delays = forever . delay . liftIO . threadDelay
An infinite computation that prints an increasing series of numbers.
foobar' :: (MonadIO m) => Int -> IterT m a
foobar' x = do
    let x' = x + 1
    liftIO (print x')
    delay (foobar' x')
With this in place, we can write foobar as:
foobar :: (MonadIO m) => MVar Bool -> m ()
foobar v = retract (delays 1000000 `mplus` untilTrue v `mplus` foobar' 0)
The neat thing about this is that you can change or remove the "stopping condition" and the delay very easily.
Some clarifications:
The delay function is not a delay in IO, it just tells the iterative monad transformer to "put the argument in a separate step".
retract brings you back from the iterative monad transformer to the base monad. It's like saying "I don't care about the steps, just run the computation". You can combine retract with cutoff if you want to limit the maximum number of iterations.
untilJust converts a value m (Maybe a) of the base monad into an IterT m a by retrying at each step until a Just is returned. Of course, this risks non-termination!
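For example, a hedged sketch of the retract/cutoff combination mentioned above (foobarCapped is a made-up name): cutoff wraps the result in a Maybe, returning Nothing if the step limit was reached before the computation finished.

foobarCapped :: (MonadIO m) => MVar Bool -> m (Maybe ())
foobarCapped v = retract . cutoff 100 $
    delays 1000000 `mplus` untilTrue v `mplus` foobar' 0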
MPredicate is rather superfluous here; m Bool can be used instead. The monad-loops package contains plenty of control structures with m Bool conditions. whileM_ in particular is applicable here, although we need to include a State monad for the Int that we're threading around:
import Control.Concurrent
import Control.Monad.State
import Control.Monad.Loops
import Control.Applicative

foobar :: MVar Bool -> IO ()
foobar mvb = (`evalStateT` (0 :: Int)) $
    whileM_ (not <$> lift (readMVar mvb)) $ do
        modify (+1)
        lift . print =<< get
        lift $ threadDelay 1000000
Alternatively, we can use a monadic version of unless. For some reason monad-loops doesn't export such a function, so let's write it:
unlessM :: Monad m => m Bool -> m () -> m ()
unlessM mb action = do
    b <- mb
    unless b action
It's somewhat more convenient and more modular in a monadic setting, since we can always go from a pure Bool to m Bool, but not vice versa.
foobar :: MVar Bool -> IO ()
foobar mvb = go 0
  where
    go :: Int -> IO ()
    go x = unlessM (readMVar mvb) $ do
        let x' = x + 1
        print x'
        threadDelay 1000000
        go x'
You mentioned fix; sometimes people indeed use it for ad-hoc monadic loops, for example:
printUntil0 :: IO ()
printUntil0 = do
    putStrLn "hello"
    fix $ \loop -> do
        n <- fmap read getLine :: IO Int
        print n
        when (n /= 0) loop
    putStrLn "bye"
With some juggling it's possible to use fix with multi-argument functions. In the case of foobar:
foobar :: MVar Bool -> IO ()
foobar mvb = ($ (0 :: Int)) $ fix $ \loop x -> do
    unlessM (readMVar mvb) $ do
        let x' = x + 1
        print x'
        threadDelay 1000000
        loop x'
I'm not sure what your MPredicate is doing.
First, instead of newtyping a tuple, it's probably better to use a normal algebraic data type
data MPredicate a m = MPredicate a (a -> m Bool)
Second, the way you use it, MPredicate is equivalent to m Bool.
Haskell is lazy, therefore there is no need to pass a function and its argument (even though it's useful in strict languages). Just pass the result, and the function will be called when needed. I mean, instead of passing (x, f) around, just pass f x.
Of course, if you are not trying to delay the evaluation and really do need, at some point, the argument or the function as well as the result, then a tuple is fine.
Anyway, in the case where your MPredicate is only there to delay the function evaluation, MPredicate reduces to m Bool and doUnless to unless.
Your first example is strictly equivalent to:
main = do
    b <- newMVar False
    v <- readMVar b
    unless v (print "foo")
Now, if you want to loop a monad until a condition is reached (or similar), you should have a look at the monad-loops package. What you are looking for is probably untilM_ or something equivalent.
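For instance, a hedged sketch of foobar using untilM_ from monad-loops (here called foobarLoop; note that untilM_ runs the body once before the first check, unlike the original unless-first loop), threading the counter through StateT:

import Control.Concurrent (MVar, readMVar, threadDelay)
import Control.Monad.Loops (untilM_)
import Control.Monad.State

foobarLoop :: MVar Bool -> IO ()
foobarLoop mvb = (`evalStateT` (0 :: Int)) $
    (do modify (+ 1)
        lift . print =<< get
        lift $ threadDelay 1000000)
    `untilM_` lift (readMVar mvb)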

Haskell - Actor based mutability

I'm working on a Haskell network application and I use the actor pattern to manage multithreading. One thing I came across is how to store, for example, a set of client sockets/handles, which of course must be accessible to all threads and can change when clients log on/off.
Since I'm coming from the imperative world I thought about some kind of lock mechanism, but when I noticed how ugly this is I thought about "pure" mutability - well, actually it's only kind of pure:
import Control.Concurrent
import Control.Monad
import Network
import System.IO
import Data.List
import Data.Maybe
import System.Environment
import Control.Exception
newStorage :: (Eq a, Show a) => IO (Chan (String, Maybe (Chan [a]), Maybe a))
newStorage = do
    q <- newChan
    forkIO $ storage [] q
    return q

newHandleStorage :: IO (Chan (String, Maybe (Chan [Handle]), Maybe Handle))
newHandleStorage = newStorage

storage :: (Eq a, Show a) => [a] -> Chan (String, Maybe (Chan [a]), Maybe a) -> IO ()
storage s q = do
    let loop = (`storage` q)
    (req, reply, d) <- readChan q
    print ("processing " ++ show d)
    case req of
        "add"    -> loop (fromJust d : s)
        "remove" -> loop (delete (fromJust d) s)
        "get"    -> do
            writeChan (fromJust reply) s
            loop s

store s d = writeChan s ("add", Nothing, Just d)
unstore s d = writeChan s ("remove", Nothing, Just d)

request s = do
    chan <- newChan
    writeChan s ("get", Just chan, Nothing)
    readChan chan
The point is that a thread (actor) manages a list of items and modifies the list according to incoming requests. Since threads are really cheap, I thought this could be a really nice functional alternative.
Of course this is just a prototype (a quick dirty proof of concept).
So my question is:
Is this a "good" way of managing shared mutable variables (in the actor world) ?
Is there already a library for this pattern ? (I already searched but I found nothing)
Regards,
Chris
Here is a quick and dirty example using stm and pipes-network. This will set up a simple server that allows clients to connect and increment or decrement a counter. It will display a very simple status bar showing the current tallies of all connected clients and will remove client tallies from the bar when they disconnect.
First I will begin with the server, and I've generously commented the code to explain how it works:
import Control.Concurrent.STM (STM, atomically)
import Control.Concurrent.STM.TVar
import qualified Data.HashMap.Strict as H
import Data.Foldable (forM_)
import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (unless)
import Control.Monad.Trans.State.Strict
import qualified Data.ByteString.Char8 as B
import Control.Proxy
import Control.Proxy.TCP
import System.IO
main = do
    hSetBuffering stdout NoBuffering

    {- These are the internal data structures.  They should be an implementation
       detail and you should never expose these references to the
       "business logic" part of the application. -}
    -- I use nRef to keep track of creating fresh Ints (which identify users)
    nRef <- newTVarIO 0 :: IO (TVar Int)
    {- hMap associates every user (i.e. Int) with a counter

       Notice how I've "striped" the hash map by storing STM references to the
       values instead of storing the values directly.  This means that I only
       actually write the hashmap when adding or removing users, which reduces
       contention for the hash map.

       Since each user gets their own unique STM reference for their counter,
       modifying counters does not cause contention with other counters or
       contention with the hash map. -}
    hMap <- newTVarIO H.empty :: IO (TVar (H.HashMap Int (TVar Int)))

    {- The following code makes heavy use of Haskell's pure closures.  Each
       'let' binding closes over its current environment, which is safe since
       Haskell is pure. -}
    let {- 'getCounters' is the only server-facing command in our STM API.  The
           only permitted operation is retrieving the current set of user
           counters.

           'getCounters' closes over the 'hMap' reference currently in scope so
           that the server never needs to be aware about our internal
           implementation. -}
        getCounters :: STM [Int]
        getCounters = do
            refs <- fmap H.elems (readTVar hMap)
            mapM readTVar refs

        {- 'init' is the only client-facing command in our STM API.  It
           initializes the client's entry in the hash map and returns two
           commands: the first command is what the client calls to 'increment'
           their counter and the second command is what the client calls to log
           off and delete their counter (the 'delete' command).

           Notice that those two returned commands each close over the client's
           unique STM reference so the client never needs to be aware of how
           exactly 'init' is implemented under the hood. -}
        init :: STM (STM (), STM ())
        init = do
            n <- readTVar nRef
            writeTVar nRef $! n + 1

            ref <- newTVar 0
            modifyTVar' hMap (H.insert n ref)

            let incrementRef :: STM ()
                incrementRef = do
                    mRef <- fmap (H.lookup n) (readTVar hMap)
                    forM_ mRef $ \ref -> modifyTVar' ref (+ 1)

                deleteRef :: STM ()
                deleteRef = modifyTVar' hMap (H.delete n)

            return (incrementRef, deleteRef)

    {- Now for the actual program logic.  Everything past this point only uses
       the approved STM API (i.e. 'getCounters' and 'init').  If I wanted I
       could factor the above approved STM API into a separate module to enforce
       the encapsulation boundary, but I am lazy. -}

    {- Fork a thread which polls the current state of the counters and displays
       it to the console.  There is a way to implement this without polling but
       this gets the job done for now.

       Most of what it is doing is just some simple tricks to reuse the same
       console line instead of outputting a stream of lines.  Otherwise it
       would be just:

       forkIO $ forever $ do
           ns <- atomically getCounters
           print ns
    -}
    forkIO $ (`evalStateT` 0) $ forever $ do
        del <- get
        lift $ do
            putStr (replicate del '\b')
            putStr (replicate del ' ' )
            putStr (replicate del '\b')
        ns <- lift $ atomically getCounters
        let str = show ns
        lift $ putStr str
        put $! length str
        lift $ threadDelay 10000

    {- Fork a thread for each incoming connection, which listens to the client's
       commands and translates them into 'STM' actions -}
    serve HostAny "8080" $ \(socket, _) -> do
        (increment, delete) <- atomically init

        {- Right now, just do the dumb thing and convert all keypresses into
           increment commands, with the exception of the 'q' key, which will
           quit -}
        let handler :: (Proxy p) => () -> Consumer p Char IO ()
            handler () = runIdentityP loop
              where
                loop = do
                    c <- request ()
                    unless (c == 'q') $ do
                        lift $ atomically increment
                        loop

        {- This uses my 'pipes' library.  It basically is a high-level way to
           say:
           * Read binary packets from the socket no bigger than 4096 bytes
           * Get the first character from each packet and discard the rest
           * Handle the character using the above 'handler' function -}
        runProxy $ socketReadS 4096 socket >-> mapD B.head >-> handler

        {- The above pipeline finishes either when the socket closes or
           'handler' stops looping because it received a 'q'.  Either case means
           that the client is done so we log them out using 'delete'. -}
        atomically delete
Next up is the client, which simply opens a connection and forwards all key presses as single packets:
import Control.Monad
import Control.Proxy
import Control.Proxy.Safe
import Control.Proxy.TCP.Safe
import Data.ByteString.Char8 (pack)
import System.IO
main = do
    hSetBuffering stdin NoBuffering
    hSetEcho      stdin False

    {- Again, this uses my 'pipes' library.  It basically says:
       * Read characters from the console using 'commands'
       * Pack them into a binary format
       * Send them to a server running at 127.0.0.1:8080
       This finishes looping when the user types a 'q' or the connection is
       closed for whatever reason.
    -}
    runSafeIO $ runProxy $ runEitherK $
         try . commands
     >-> mapD (\c -> pack [c])
     >-> connectWriteD Nothing "127.0.0.1" "8080"

commands :: (Proxy p) => () -> Producer p Char IO ()
commands () = runIdentityP loop
  where
    loop = do
        c <- lift getChar
        respond c
        unless (c == 'q') loop
It's pretty simple: commands generates a stream of Chars, which then get converted to ByteStrings and then sent as packets to the server.
If you run the server and a few clients and have them each type in a few keys, your server display will output a list showing how many keys each client typed:
[1,6,4]
... and if some of the clients disconnect they will be removed from the list:
[1,4]
Note that the pipes component of these examples will simplify greatly in the upcoming pipes-4.0.0 release, but the current pipes ecosystem still gets the job done as is.
First, I'd definitely recommend using your own specific data type for representing commands. When using (String, Maybe (Chan [a]), Maybe a) a buggy client can crash your actor simply by sending an unknown command or by sending ("add", Nothing, Nothing), etc. I'd suggest something like
data Command a = Add a | Remove a | Get (Chan [a])
Then you can pattern match on commands in storage in a safe way.
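As a hedged sketch (reusing the imports from the question), storage rewritten over that Command type could look roughly like this:

storage :: (Eq a) => [a] -> Chan (Command a) -> IO ()
storage s q = do
    cmd <- readChan q
    case cmd of
        Add x     -> storage (x : s) q
        Remove x  -> storage (delete x s) q
        Get reply -> writeChan reply s >> storage s q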
Actors have their advantages, but I also feel that they have some drawbacks. For example, getting an answer from an actor requires sending it a command and then waiting for a reply. And the client can't be completely sure that it gets a reply, or that the reply will be of some specific type - you can't say "I want only answers of this type (and this many of them) for this particular command".
So as an example I'll give a simple, STM solution. It'd be better to use a hash table or a (balanced tree) set, but since Handle implements neither Ord nor Hashable, we can't use these data structures, so I'll keep using lists.
module ThreadSet (
TSet, add, remove, get
) where
import Control.Monad
import Control.Monad.STM
import Control.Concurrent.STM.TVar
import Data.List (delete)
newtype TSet a = TSet (TVar [a])
add :: (Eq a) => a -> TSet a -> STM ()
add x (TSet v) = readTVar v >>= writeTVar v . (x :)
remove :: (Eq a) => a -> TSet a -> STM ()
remove x (TSet v) = readTVar v >>= writeTVar v . delete x
get :: (Eq a) => TSet a -> STM [a]
get (TSet v) = readTVar v
This module implements an STM-based set of arbitrary elements. You can have multiple such sets and use them together in a single STM transaction that succeeds or fails atomically. For example
-- | Ensures that there is exactly one element `x` in the set.
add1 :: (Eq a) => a -> TSet a -> STM ()
add1 x v = remove x v >> add x v
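Or, as another hedged sketch, an operation that touches two sets atomically (move is a made-up name):

-- | Atomically move an element from one set to another.
move :: (Eq a) => a -> TSet a -> TSet a -> STM ()
move x from to = remove x from >> add x to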
This would be difficult with actors: you'd have to add it as another command for the actor, since you can't compose it out of existing actions and still have atomicity.
Update: There is an interesting article explaining why Clojure designers chose not to use actors. For example, using actors, even if you have many reads and only very little writes to a mutable structure, they're all serialized, which can greatly impact performance.

Storing arbitrary function calls across threads

I'm trying to write a library aiming to reproduce Qt's threading semantics: signals can be connected to slots, and all slots execute in a known thread, so that slots tied to the same thread are threadsafe with regards to each other.
I have the following API:
data Signal a = Signal Unique a
data Slot a = Slot Unique ThreadId (a -> IO ())
mkSignal :: IO (Signal a)
mkSlot :: ThreadId -> (Slot a -> a -> IO ()) -> IO (Slot a)
connect :: Signal a -> Slot a -> IO ()
-- callable from any thread
emit :: Signal a -> a -> IO ()
-- runs in Slot's thread as a result of `emit`
execute :: Slot a -> a -> IO ()
execute (Slot _ _ f) arg = f arg
The problem is getting from emit to execute. The argument needs to be stored at runtime somehow, and then an IO action performed, but I can't seem to get past the type checker.
The things I need:
Type safety: signals shouldn't be connected to slots expecting a different type.
Type-independence: there can be more than one slot for any given type (perhaps this can be relaxed with newtype and/or TH).
Ease of use: since this is a library, signals and slots should be easy to create.
The things I've tried:
Data.Dynamic: makes the whole thing really fragile, and I haven't found a way to perform a correctly-typed IO action on a Dynamic. There's dynApply, but it's pure.
Existential types: I need to execute the function passed to mkSlot, as opposed to an arbitrary function based on the type.
Data.HList: I'm not smart enough to figure it out.
What am I missing?
Firstly, are you sure Slots really want to execute in a specific thread? It's easy to write thread-safe code in Haskell, and threads are very lightweight in GHC, so you're not gaining much by tying all event-handler execution to a specific Haskell thread.
Also, mkSlot's callback doesn't need to be given the Slot itself: you can use recursive do-notation to bind the slot in its callback without adding the concern of tying the knot to mkSlot.
Anyway, you don't need anything as complicated as those solutions. I expect when you talk about existential types, you're thinking about sending something like (a -> IO (), a) through a TChan (which you mentioned using in the comments) and applying it on the other end, but you want the TChan to accept values of this type for any a, rather than just one specific a. The key insight here is that if you have (a -> IO (), a) and don't know what a is, the only thing you can do is apply the function to the value, giving you an IO () — so we can just send those through the channel instead!
Here's an example:
import Data.Unique
import Control.Applicative
import Control.Monad
import Control.Concurrent
import Control.Concurrent.STM
newtype SlotGroup = SlotGroup (IO () -> IO ())
data Signal a = Signal Unique (TVar [Slot a])
data Slot a = Slot Unique SlotGroup (a -> IO ())
-- When executed, this produces a function taking an IO action and returning
-- an IO action that writes that action to the internal TChan. The advantage
-- of this approach is that it's impossible for clients of newSlotGroup to
-- misuse the internals by reading the TChan or similar, and the interface is
-- kept abstract.
newSlotGroup :: IO SlotGroup
newSlotGroup = do
    chan <- newTChanIO
    _ <- forkIO . forever . join . atomically . readTChan $ chan
    return $ SlotGroup (atomically . writeTChan chan)
mkSignal :: IO (Signal a)
mkSignal = Signal <$> newUnique <*> newTVarIO []
mkSlot :: SlotGroup -> (a -> IO ()) -> IO (Slot a)
mkSlot group f = Slot <$> newUnique <*> pure group <*> pure f
connect :: Signal a -> Slot a -> IO ()
connect (Signal _ v) slot = atomically $ do
    slots <- readTVar v
    writeTVar v (slot:slots)
emit :: Signal a -> a -> IO ()
emit (Signal _ v) a = atomically (readTVar v) >>= mapM_ (`execute` a)
execute :: Slot a -> a -> IO ()
execute (Slot _ (SlotGroup send) f) a = send (f a)
This uses a TChan to send actions to the worker thread each slot is tied to.
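A hedged usage sketch of the API above (demo is a made-up name): one worker thread, one signal, one slot; emit hands the callback's application to the slot's group, which runs it on the group's thread.

demo :: IO ()
demo = do
    group <- newSlotGroup
    sig   <- mkSignal
    slot  <- mkSlot group (\n -> putStrLn ("got " ++ show (n :: Int)))
    connect sig slot
    emit sig 42
    threadDelay 100000  -- crude: give the worker thread a moment to run the callback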
Note that I'm not very familiar with Qt, so I may have missed some subtlety of the model. You can also disconnect Slots with this:
disconnect :: Signal a -> Slot a -> IO ()
disconnect (Signal _ v) (Slot u _ _) = atomically $ do
    slots <- readTVar v
    writeTVar v $ filter keep slots
  where
    keep (Slot u' _ _) = u' /= u
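And, tying back to the earlier remark about recursive do-notation: with RecursiveDo a callback can close over the very slot being created, e.g. to disconnect itself after the first event (oneShot is a made-up name; a sketch, not part of the API above):

{-# LANGUAGE RecursiveDo #-}

oneShot :: Signal Int -> SlotGroup -> IO ()
oneShot sig group = mdo
    slot <- mkSlot group $ \n -> do
        print n
        disconnect sig slot   -- the callback refers to the slot defined here
    connect sig slot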
You might want something like Map Unique (Slot a) instead of [Slot a] if this is likely to be a bottleneck.
So, the solution here is to (a) recognise that you have something that's fundamentally based upon mutable state, and use a mutable variable to structure it; (b) realise that functions and IO actions are first-class just like everything else, so you don't have to do anything special to construct them at runtime :)
By the way, I suggest keeping the implementations of Signal and Slot abstract by not exporting their constructors from the module defining them; there are many ways to tackle this approach without changing the API, after all.

Resources