Is it possible to run several instances of hint in parallel? - haskell

Is there any way to start two hint interpreters and then, at runtime, assign smaller computations to one or the other? When I invoke hint for a small expression (e.g. one typed into a website), it seems to me (without reliable testing) that starting/loading hint takes approximately one second. If the instance were already started, that second would be shaved off.
hint does not seem to have a function that lets me start an interpreter and keep it idle for later use.
(Auto)plugins would be a further option, of course, but I think that approach is more suitable for modules and less elegant for smaller computations.

The GHC API, which hint is implemented in terms of (as are the various plugin packages), does not support concurrent use.
You can leave hint running, though. It's an instance of MonadIO.
import Control.Concurrent.Chan
import Control.Concurrent.MVar
import Control.Monad.IO.Class (liftIO)
import Data.Typeable (Typeable)
import Language.Haskell.Interpreter

-- MonadInterpreter provides both liftIO and interpret.
interpreterLoop :: (MonadInterpreter m, Typeable a) => Chan (MVar a, String) -> m ()
interpreterLoop ch = do
    (mvar, command) <- liftIO $ readChan ch
    a <- interpret command $ argTypeWitness mvar
    liftIO $ putMVar mvar a
    interpreterLoop ch
  where
    argTypeWitness :: MVar a -> a
    argTypeWitness = undefined -- this value is only used for type checking, never evaluated

runInLoop :: Typeable a => Chan (MVar a, String) -> String -> IO a
runInLoop ch command = do
    mvar <- newEmptyMVar
    writeChan ch (mvar, command)
    takeMVar mvar
(I didn't test this, so I may have missed a detail or two, but the basic idea will work.)
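To make the idea concrete, here is a minimal usage sketch (my addition, not part of the original answer) that starts one interpreter in its own thread, pays the startup cost once, and then reuses it for several expressions. It assumes hint's runInterpreter and setImports, and a channel carrying Int results; if any interpretation fails, the whole loop exits with the error.

import Control.Concurrent (forkIO)
import Control.Concurrent.Chan
import Control.Concurrent.MVar
import Language.Haskell.Interpreter (runInterpreter, setImports)

main :: IO ()
main = do
    ch <- newChan :: IO (Chan (MVar Int, String))
    -- Start one interpreter and keep it serving requests.
    _ <- forkIO $ do
        r <- runInterpreter (setImports ["Prelude"] >> interpreterLoop ch)
        either print return r  -- report an InterpreterError if the loop dies
    -- Later requests reuse the already-running interpreter.
    print =<< runInLoop ch "2 + 2"
    print =<< runInLoop ch "product [1..5]"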

Related

Timeout on StateT with IO

I have a custom type
type GI a = StateT GenState IO a
where GenState is a state I keep for Generating Random Trees of some kind.
When generating my trees, termination is not guaranteed in a reasonable amount of time. That's why I thought I might terminate the calculation and restart it over and over again with a timeout until a result is given.
So my question is how to write a function of the form
tryGeneration :: GI a -> GI a
tryGeneration action = ...
where action is the calculation to try in some microseconds and if it runs out of time begin the action from the start.
Please keep in mind that I'm quite new to monad transformers and I cannot say that I fully understand them yet.
I tried to use lift with System.Timeout.timeout and did not succeed
EDIT: Thank you all for your suggestions. I followed them and got it done in the IO monad.
tryGenerationTime :: Int -> GenState -> GI a -> IO (a, GenState)
tryGenerationTime time state action = do
    (_, s') <- -- change the random state to not generate the same thing over and over
    res <- timeout time (runStateT action s')
    case res of
        Nothing -> tryGenerationTime time s' action
        Just r  -> return r

timeItT :: Int -> GI a -> GI a
timeItT time action = do
    state <- get
    (x, s') <- lift $ tryGenerationTime time state action
    put s'
    return x
Any suggestion for improving this code is welcome. I just wanted to get it done quickly, since this wasn't the real solution to my generation problem anyway; I needed to set a limit on the tree height to succeed.
I suspect what you actually want is more like
tryGeneration :: GI a -> IO a
tryGeneration action = ...
in such a way that all of your "build a tree" actions have timeout-based retries.
The key thing to understand is that "attempt to do X; if you aren't done in n milliseconds, start over" is IO's job; IO is where you have access to things like time. (Of course there are wrappers you could and should use when you only need part of what IO has to offer.)
This is fine; you have access to IO in GI, you probably just have to lift it.
That said, there's not enough information here to say exactly how to do what you want, and I'm more familiar with free-monad effect systems than mtl transformers anyway...
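As a minimal sketch of that idea (my own illustration, not code from this answer), here is the "attempt, and start over if you run out of time" loop expressed as a plain IO action, with the timeout given in microseconds:

import System.Timeout (timeout)

-- Keep restarting the action until one attempt finishes within the time limit.
retryWithTimeout :: Int -> IO a -> IO a
retryWithTimeout micros action = do
    result <- timeout micros action
    case result of
        Nothing -> retryWithTimeout micros action  -- timed out: start over
        Just r  -> return r

The GI-specific version in the question's EDIT is essentially this loop with runStateT applied first.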

How to limit code changes when introducing state?

I am a senior C/C++/Java/Assembler programmer and I have always been fascinated by the pure functional programming paradigm. From time to time, I try to implement something useful with it, e.g., a small tool, but I often quickly reach a point where I realize that I (and my tool, too) would be much faster in a non-pure language. It's probably because I have much more experience with imperative programming languages, with thousands of idioms, patterns, and typical solution approaches in my head.
Here is one of those situations. I have encountered it several times and I hope you guys can help me.
Let's assume I write a tool to simulate communication networks. One important task is the generation of network packets. The generation is quite complex, consisting of dozens of functions and configuration parameters, but at the end there is one master function and because I find it useful I always write down the signature:
generatePackets :: Configuration -> [Packet]
However, after a while I notice that it would be great if the packet generation had some kind of random behavior deep down in one of the many sub-functions of the generation process. Since I need a random number generator for that (and I also need it in some other places in the code), this means manually changing dozens of signatures to something like
f :: Configuration -> RNGState [Packet]
with
type RNGState = State StdGen
I understand the "mathematical" necessity (no states) behind this. My question is on a higher (?) level: How would an experienced Haskell programmer have approached this situation? What kind of design pattern or work flow would have avoided the extra work later?
I have never worked with an experienced Haskell programmer. Maybe you will tell me that you never write signatures because you have to change them too often afterwards, or that you give all your functions a state monad, "just in case" :)
One approach that I've been fairly successful with is using a monad transformer stack. This lets you both add new effects when needed and also track the effects required by particular functions.
Here's a really simple example.
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.State
import Control.Monad.Reader

data Config = Config { v1 :: Int, v2 :: Int }

-- the type of the entire program describes all the effects that it can do
type Program = StateT Int (ReaderT Config IO) ()

runProgram program config startState =
    runReaderT (runStateT program startState) config

-- doesn't use configuration values, doesn't do IO
step1 :: MonadState Int m => m ()
step1 = get >>= \x -> put (x+1)

-- can use configuration and change state, but can't do IO
step2 :: (MonadReader Config m, MonadState Int m) => m ()
step2 = do
    x <- asks v1
    y <- get
    put (x+y)

-- can use configuration and do IO, but won't touch our internal state
step3 :: (MonadReader Config m, MonadIO m) => m ()
step3 = do
    x <- asks v2
    liftIO $ putStrLn ("the value of v2 is " ++ show x)

program :: Program
program = step1 >> step2 >> step3

main :: IO ()
main = do
    let config = Config { v1 = 42, v2 = 123 }
        startState = 17
    result <- runProgram program config startState
    return ()
Now if we want to add another effect:
import Control.Monad.Writer

step4 :: MonadWriter String m => m ()
step4 = tell "done!"

program :: Program
program = step1 >> step2 >> step3 >> step4
Just adjust Program and runProgram
type Program = StateT Int (ReaderT Config (WriterT String IO)) ()

runProgram program config startState =
    runWriterT $ runReaderT (runStateT program startState) config
To summarize, this approach lets us decompose a program in a way that tracks effects but also allows adding new effects as needed without a huge amount of refactoring.
edit:
It's come to my attention that I didn't answer the question about what to do for code that's already written. In many cases, it's not too difficult to change pure code into this style:
computation :: Double -> Double -> Double
computation x y = x + y
becomes
computation :: Monad m => Double -> Double -> m Double
computation x y = return (x + y)
This function will now work for any monad, but doesn't have access to any extra effects. Specifically, if we add another monad transformer to Program, then computation will still work.
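Applied to the packet-generation example from the question, a minimal sketch of the same idea could look like this (my illustration; Packet, the payload sizes, and the helper names are placeholders, not from the question or the answer). Randomness is threaded through a MonadState StdGen constraint, so only the sub-functions that actually need the generator mention it, and the concrete stack is chosen once at the top level:

{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.State
import System.Random

type Packet = String  -- placeholder for illustration

-- Only the functions that actually need randomness carry the constraint.
randomPayloadSize :: MonadState StdGen m => m Int
randomPayloadSize = state (randomR (64, 1500))

generatePacket :: MonadState StdGen m => m Packet
generatePacket = do
    n <- randomPayloadSize
    return ("packet of size " ++ show n)

-- The concrete stack appears only here, so the original generatePackets
-- signature changes in one place rather than in dozens of sub-functions.
generatePackets :: StdGen -> Int -> [Packet]
generatePackets gen count = evalState (replicateM count generatePacket) gen
-- e.g. generatePackets (mkStdGen 42) 3

A Rand/RandT monad from the MonadRandom package would express the same thing more directly; the point is just that a constraint, not a concrete stack, appears in the intermediate signatures.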

Getting parallel IO while accounting for failure

I'm making several API calls that are encapsulated in a type alias:
type ConnectT a = EitherT String (RWST ConnectReader ConnectWriter ConnectState IO) a
Here's a simplified version of a function which connects to two separate APIs:
connectBoth :: ConnectT ()
connectBoth = do
a <- connectAPI SomeAPI someFunction
b <- connectAPI OtherAPI otherFunction
connectAPI OtherAPI (b `sendTo` a)
The final call in connectBoth is very time sensitive (and the transactions are of a financial nature). I figure a and b could be evaluated in parallel, and with lazy IO I should be able to do this:
b <- a `par` connectAPI OtherAPI otherFunction
The documentation for par says that it "indicates that it may be beneficial to evaluate the first argument in parallel with the second."
Does this work with IO?
Can I get any more guaranteed than "it may be beneficial?"
Or if I want greater guarantees will I need to use an MVar and liftIO . forkIO?
If I evaluate a first, I think I can use eitherT to check if a succeeded. But if I evaluate both at the same time I get confused. Here is the situation:
If only a failed, I will retry a; if that fails, I will run a function that manually reverses b.
If only b failed, I will retry b, write to the log in RWS, and return Left.
If both fail, write to the log in RWS and return Left.
If both succeed, process c (which is not as time sensitive as a or b).
But if I evaluate both in parallel, then how can I identify which one failed? If I use eitherT immediately after a then a will evaluate first. If I use it after b then I won't be able to tell which one failed.
Is there a way I can evaluate the IO calls in parallel but respond differently depending on which one (if any) fails? Or am I left with a choice of parallelism vs failure mitigation?
The solution you are looking for will use forkIO and MVars.
par
par is for multiprocessor parallelism; it helps evaluate pure terms in parallel. It doesn't help with IO. If you do
do
    a <- (someProcess :: IO a)
    ...
By the time you reach ... everything from the IO action has happened (if we ignore evil lazy IO) to a point that a can be determined entirely by ordinary evaluation. This means that by the time you do b <- someOtherProcess, all of someProcess is already done. It's too late to do anything in parallel.
EitherT
You can explicitly examine the Either e a result of an EitherT e m a. runEitherT :: EitherT e m a -> m (Either e a) makes the success or failure explicit in the underlying monad. We can lift that right back into EitherT to make a computation that always succeeds (sometimes with an error) from one that sometimes fails.
import Control.Monad.Trans.Class
import Control.Monad.Trans.Either (EitherT, runEitherT)

examine :: (MonadTrans t, Monad m) => EitherT e m a -> t m (Either e a)
examine = lift . runEitherT
forkIO
The simplest solution for doing two things in IO is forkIO. It starts another lightweight thread that you can forget about.
If you run a value with your transformer stack, there will be four pieces of data when you are done. The state ConnectState, the written ConnectWriter log, whether the computation was successful, and, depending on whether or not it was successful, either the value or the error.
EitherT String (RWST ConnectReader ConnectWriter ConnectState IO) a
If we write out the structure of this, it looks like
(RWST ConnectReader ConnectWriter ConnectState IO) (Either String a)
ConnectReader -> ConnectState -> IO (Either String a, ConnectState, ConnectWriter)
All four of those pieces of information end up in the result of the IO action. If you fork your stack, you need to decide what to do with all of them when you join the results back together. You have already decided that you want to explicitly handle the Either String a. The ConnectWriters can probably be combined together with <>. You will need to decide what to do with ConnectState.
We'll make a fork that returns all four of these pieces of data by shoving them into an MVar.
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad.IO.Class

forkConnectT :: ConnectT a -> ConnectT (MVar (Either String a, ConnectState, ConnectWriter))
forkConnectT cta = do
    result <- liftIO newEmptyMVar
    r <- lift ask
    s <- lift get
    liftIO $ forkIO $ do
        state <- runRWST (runEitherT cta) r s
        putMVar result state
    return result
Later, when we want the result, we can try and see if it is ready. We'll handle the Either for success and failure explicitly, while handling the state and writer behind the scenes.
import Data.Traversable
tryJoinConnectT :: MVar (Either String a, ConnectState, ConnectWriter) -> ConnectT (Maybe (Either String a))
tryJoinConnectT result = liftIO (tryTakeMVar result) >>= traverse reintegrate
Behind the scenes we reintegrate the ConnectWriter by telling this ConnectT to write what was accumulated in the other thread. You will need to decide what to do to combine the two states.
reintegrate :: (a, ConnectState, ConnectWriter) -> ConnectT a
reintegrate (a, s, w) = do
    -- Whatever needs to be done with the state.
    -- stateHere <- lift get
    lift $ tell w
    return a
If we want to wait until the result is ready, we can block reading the MVar. This offers less opportunity for handling errors such as timeouts.
joinConnectT :: MVar (Either String a, ConnectState, ConnectWriter) -> ConnectT (Either String a)
joinConnectT result = liftIO (takeMVar result) >>= reintegrate
Example
Putting it all together, we can fork a task in parallel, do something in this thread explicitly examining the success or failure, join with the result from the other thread, and reason about what to do next with explicit Eithers representing success or failure from each process.
connectBoth :: ConnectT ()
connectBoth = do
    bVar <- forkConnectT $ connectAPI OtherAPI otherFunction
    a <- examine $ connectAPI SomeAPI someFunction
    b <- joinConnectT bVar
    ...
Going farther
If you are paranoid, you will also want to handle exceptions (some of which can be handled by forkFinally) and asynchronous exceptions. You will need to decide whether to bundle these exceptions into your stack or treat IO like it can always throw exceptions.
Consider using async instead of forkIO and MVars.
monad-control (which you already depend on via either) provides mechanisms for building up, one transformer at a time, the type that represents the state of a monad transformer stack. We wrote this by hand as (Either String a, ConnectState, ConnectWriter). If you are going to grow your transformer stack, you might want to get this from MonadTransControl instead. You can restore the state from the forked thread (see the MonadBaseControl section) in the parent to inspect it. You will still need to decide how to deal with the data from the two states.
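Following the async suggestion above, here is a hedged sketch (my own, not part of the original answer, reusing the ConnectT definitions and imports from earlier) of what forkConnectT and joinConnectT might look like with the async package instead of a bare MVar:

import Control.Concurrent.Async (Async, async, wait)

-- Run a ConnectT action in another thread, returning a handle to its full result.
asyncConnectT :: ConnectT a -> ConnectT (Async (Either String a, ConnectState, ConnectWriter))
asyncConnectT cta = do
    r <- lift ask
    s <- lift get
    liftIO $ async (runRWST (runEitherT cta) r s)

-- Block until the forked computation finishes, then reintegrate its writer and
-- state exactly as joinConnectT does above.
waitConnectT :: Async (Either String a, ConnectState, ConnectWriter) -> ConnectT (Either String a)
waitConnectT a = liftIO (wait a) >>= reintegrate

One advantage of async here is that wait rethrows exceptions from the forked thread instead of leaving an MVar empty forever.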

Is there a lazy Session IO Monad?

You have a sequence of actions that prefer to be executed in chunks due to some high fixed overhead, like packet headers or making connections. The catch is that sometimes the next action depends on the result of a previous one, in which case all pending actions are executed at once.
Example:
mySession :: Session IO ()
mySession = do
    a <- readit -- nothing happens yet
    b <- readit -- nothing happens yet
    c <- readit -- nothing happens yet
    if a -- all three readits execute because we need a
        then write "a"
        else write "..."
    if b || c -- b and c already available
    ...
This reminds me of so many Haskell concepts but I can't put my finger on it.
Of course, you could do something obvious like:
[a,b,c] <- batch [readit, readit, readit]
But I'd like to hide the fact of chunking from the user for slickness purposes.
Not sure if Session is the right word. Maybe you can suggest a better one? (Packet, Batch, Chunk and Deferred come to mind.)
Update
I think there was a really good answer last night that I read on my phone but when I came back to look for it today it was gone. Was I dreaming?
I don't think you can do exactly what you want, since what you describe exploits Haskell's lazy evaluation to have the evaluation of a force the actions that compute b and c, and there's no way to seq on unspecified values.
What I could do was hack together a monad transformer that delayed actions sequenced via >> so that they could be executed all together:
import Control.Monad (ap, liftM)

data Session m a = Session { pending :: [m ()], final :: m a }

runSession :: Monad m => Session m a -> m a
runSession (Session ms ma) = foldr (flip (>>)) (return ()) ms >> ma

-- Functor/Applicative instances are needed for this to compile on modern GHC
instance Monad m => Functor (Session m) where fmap = liftM
instance Monad m => Applicative (Session m) where { pure = Session [] . return; (<*>) = ap }
instance Monad m => Monad (Session m) where
    return = Session [] . return
    s >>= f = Session [] $ runSession s >>= (runSession . f)
    (Session ms ma) >> (Session ms' ma') =
        Session (ms' ++ (ma >> return ()) : ms) ma'
This violates some monad laws, but lets you do something like:
import Debug.Trace (trace)

liftIO :: IO a -> Session IO a
liftIO = Session []

exampleSession :: Session IO Int
exampleSession = do
    liftIO $ putStrLn "one"
    liftIO $ putStrLn "two"
    liftIO $ putStrLn "three"
    liftIO $ putStrLn "four"
    trace "five" $ return 5
and get
ghci> runSession exampleSession
five
one
two
three
four
5
ghci> length (pending exampleSession)
4
This is very similar to what Haxl does.
For more info:
Open sourcing haxl - Facebook Code Blog
ICFP 2014 talk
You could use the unsafeInterleaveIO function. It is a dangerous function that can introduce bugs to your program if not used carefully, but it does what you're asking for.
You can insert it into your example code like this:
import System.IO.Unsafe (unsafeInterleaveIO)

lazyReadits :: IO [a]
lazyReadits = unsafeInterleaveIO $ do
    a <- readit
    r <- lazyReadits
    return (a:r)
unsafeInterleaveIO makes the action as a whole lazy, but once it starts evaluating, it evaluates as if it had been strict. In my example above, this means readit will run as soon as something tests whether the returned list is empty or not. If I'd used mapM unsafeInterleaveIO (replicate 3 readit) instead, then each readit would only run when the corresponding element of the list is evaluated. That would make the contents of the list depend on the order in which its elements are inspected, which is one example of how unsafeInterleaveIO can introduce bugs.
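For reference, here is that alternative written out as code (my rendering of the expression mentioned above; readit is the hypothetical action from the question):

-- Each readit is delayed independently and runs only when its list element is inspected.
lazyReaditsEach :: IO [a]
lazyReaditsEach = mapM unsafeInterleaveIO (replicate 3 readit)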

How do I create a thread pool?

Sometimes I want to run a maximum number of IO actions in parallel at once, for network activity etc. I whipped up a small concurrent-thread function (https://gist.github.com/810920) which works well, but this isn't really a pool, as all IO actions must finish before others can start.
The type of what I'm looking for would be something like:
runPool :: Int -> [IO a] -> IO [a]
and should be able to operate on finite and infinite lists.
The pipes package looks like it could achieve this quite well, but I feel there is probably a solution similar to the gist I have provided that just uses MVars etc. from the haskell-platform.
Has anyone encountered an idiomatic solution without any heavy dependencies?
You need a thread pool. If you want something short, you could get inspiration from Control.ThreadPool (from the control-engine package, which also provides more general functions); for instance, threadPoolIO is just:
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan
import Control.Monad (forM_, forever)

threadPoolIO :: Int -> (a -> IO b) -> IO (Chan a, Chan b)
threadPoolIO nr mutator = do
    input  <- newChan
    output <- newChan
    forM_ [1..nr] $
        \_ -> forkIO (forever $ do
            i <- readChan input
            o <- mutator i
            writeChan output o)
    return (input, output)
It uses two Chans for communication with the outside, but that's usually what you want; it really helps you write code that doesn't get tangled up.
If you absolutely want to wrap it up in a function of your type, you can encapsulate the communication too:
runPool :: Int -> [IO a] -> IO [a]
runPool n as = do
    (input, output) <- threadPoolIO n id
    forM_ as $ writeChan input
    sequence (replicate (length as) (readChan output))
This won't keep the order of your actions; is that a problem? (It's easy to correct by transmitting the index of each action, or by using an array instead to store the responses.)
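A hedged usage sketch of runPool as defined above (my own example; the jobs are artificial delays, and the results may indeed come back out of order, as noted):

import Control.Concurrent (threadDelay)

main :: IO ()
main = do
    -- ten small jobs, at most three running at a time
    let jobs = [ threadDelay 100000 >> return n | n <- [1 .. 10 :: Int] ]
    results <- runPool 3 jobs
    print results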
Note: the n threads will stay alive forever with this simplistic version; adding a "killAll" action to threadPoolIO's return value would resolve this problem handily if you intend to create and discard several of these pools in a long-running application (if not, given how lightweight threads are in Haskell, it's probably not worth the bother).
Note that this function works on finite lists only. That's because IO is normally strict, so you can't start to process elements of an IO [a] before the whole list is produced. If you really want that, you'll either have to use lazy IO with unsafeInterleaveIO (maybe not the best idea) or completely change your model and use something like conduits to stream your results.
