Here is a snippet of code taken from the Haskell GPipe project (commented by myself, save the line with "Really?"). In the memoize function, I don't understand why its author calls the getter a second time to cache a newly computed value. It doesn't seem necessary to me, and it can be removed without apparent bad consequences (at least, a medium-sized project of mine still works without it).
{- | A map (SN stands for stable name) to cache the results 'a' of computations 'm a'.
The type 'm' ends up being constrained to 'MonadIO m' in the various functions using it.
-}
newtype SNMap m a = SNMap (HT.BasicHashTable (StableName (m a)) a)
newSNMap :: IO (SNMap m a)
newSNMap = SNMap <$> HT.new
memoize :: MonadIO m
    => m (SNMap m a) -- ^ An "IO call" to retrieve our cache.
    -> m a           -- ^ The "IO call" to execute and cache the result.
    -> m a           -- ^ The result being naturally also returned.
memoize getter m = do
    s <- liftIO $ makeStableName $! m -- Does forcing the evaluation make sense here (since we try to avoid it...)?
    SNMap h <- getter
    x <- liftIO $ HT.lookup h s
    case x of
        Just a -> return a
        Nothing -> do
            a <- m
            SNMap h' <- getter -- Need to redo because of scope. <- Really?
            liftIO $ HT.insert h' s a
            return a
I get it. The scope term used is not related to the Haskell 'do' scope. It is simply that a computation could recursively update the cache when evaluated (as in the scopedM function in the same module...). It is kind of obvious in retrospect.
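Here is a hedged sketch of that situation (the names are purely illustrative, not GPipe's actual API): if the cache handle is fetched through the monad, running the memoized computation can swap the cache out, so the handle obtained before a <- m may be stale by the time we insert.
import Control.Monad.IO.Class
import Control.Monad.Trans.State
import qualified Data.HashTable.IO as HT

-- The cache handle lives in state; a memoized computation may replace it
-- (as scopedM does), which is why memoize asks the getter again before inserting.
type Cache k v = HT.BasicHashTable k v
type M k v = StateT (Cache k v) IO

getCache :: M k v (Cache k v)
getCache = get

-- Something a cached computation might do internally: install a fresh cache.
enterScope :: M k v ()
enterScope = liftIO HT.new >>= put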
Preamble
I am trying to wrap my head around how to actually use ContT and callCC for something useful. I'm having trouble following information and control flows around the code. (But isn't that the point of a continuation?)
There are a lot of different ways to move pieces around with this monad and a small handful of not very straightforward combinators. I will confess that I'm still uncomfortable with my understanding of how ContT works, but I will point to what I have read so far:
Haskell/Continuation passing style
How and why does the Haskell Cont monad work?
Understanding Haskell callCC examples
Goto in Haskell: Can anyone explain this seemingly insane effect of continuation monad usage?
How to interpret callCC in Haskell?
Parsec Generally (the article that started me down this path)
What I would like to do is post a pseudo-code example and then ask some questions about it. This represents the typical look of code using ContT.
Pseudo-code
type MyMonad r = ContT r (State SomeState)

main = do
    runState s_init $ runContT block print

block :: MyMonad r a0
block = do
    before_callcc
    output <- callCC $ \k -> do
        rval <- inner_block
        return rval
    after_callcc
Questions
What determines the value and type of output?
What does the b mean in the type of k?
Where does the value given to k go?
When is inner_block run? What version of the state does it see?
Where does rval go and what is its type?
What is the relationship between k and rval?
What happens when I apply k a) in inner_block, b) in after_callcc, c) outside of block?
What is the version of the state in each of the above?
What do I need to do to get k out of block?
Can I put k into the state?
What determines the value and type of output?
It will be of the same type as rval. The value will also be the same, unless inner_block uses k someValue to escape the rest of the block. In that case, output will be someValue.
What does the b mean in the type of k?
Roughly, b can be understood as "anything at all". That is, if inner_block is
...
v <- k someValue
use v
then v :: b. However, k someValue will never run the rest of the block since it will exit the callCC immediately. So, no concrete value for v will ever be returned. Because of this, v can have any type: it does not matter if use requires a String or an Int -- it's not being executed anyway.
Where does the value given to k go?
As soon as the inner block runs k someValue the value is returned by the callCC as output, and the rest of the block is skipped.
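A minimal, self-contained example of that escape behaviour (the names are made up for illustration and are not part of the pseudo-code above):
import Control.Monad (when)
import Control.Monad.Cont

escapeDemo :: Int -> Cont r Int
escapeDemo n = do
    output <- callCC $ \k -> do
        when (n < 0) (k 0) -- applying k: 0 becomes output, the next line is skipped
        return (n * 2)     -- the rval case: no escape happened
    return (output + 1)

main :: IO ()
main = do
    print (runCont (escapeDemo 5) id)    -- 11: rval = 10 flowed into output
    print (runCont (escapeDemo (-1)) id) -- 1: k 0 escaped, so output = 0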
When is inner_block run? What version of the state does it see?
It is run as callCC is called, and sees the same state as we had at that point.
Where does rval go and what is its type?
Into output. Same type.
What is the relationship between k and rval?
rval has the same type as the argument of k.
What happens when I apply k a) in inner_block, b) in after_callcc, c) outside of block?
a) See above.
b) k is out of scope in after_callcc.
c) Also out of scope.
What is the version of the state in each of the above?
The state is the "current" one. (I am not sure what you exactly are asking for here)
What do I need to do to get k out of block?
A recursive type is needed here, I guess. Here's an attempt:
import Control.Monad.Cont
import Control.Monad.State
import Control.Monad.Trans
type SomeState = String
type MyMonad r = ContT r (State SomeState)
newtype K a b r = K ((K a b r, a) -> MyMonad r b)
main = do
    print $ flip runState "init" $ runContT block return

block :: MyMonad r Int
block = do
    lift $ modify (++ ":start")
    (K myK, output) <- callCC $ \k -> do
        s <- lift $ get
        lift $ modify (++ ":inner(" ++ show (length s) ++ ")")
        return (K k, 10)
    lift $ modify (++ ":output=" ++ show output)
    s <- lift $ get
    when (length s < 50) $ myK (K myK, output + 1)
    return 5
This prints
(5,"init:start:inner(10):output=10:output=11:output=12")
Can I put k into the state?
I believe you need a recursive type for that, too.
I'm making several API calls that are encapsulated in a type alias:
type ConnectT a = EitherT String (RWST ConnectReader ConnectWriter ConnectState IO) a
Here's a simplified version of a function which connects to two separate APIs:
connectBoth :: ConnectT ()
connectBoth = do
    a <- connectAPI SomeAPI someFunction
    b <- connectAPI OtherAPI otherFunction
    connectAPI OtherAPI (b `sendTo` a)
The final call in connectBoth is very time sensitive (and the transactions are of a financial nature). I figure a and b could be evaluated in parallel, and with lazy IO I should be able to do this:
b <- a `par` connectAPI OtherAPI otherFunction
The documentation for par says that it "indicates that it may be beneficial to evaluate the first argument in parallel with the second."
Does this work with IO?
Can I get anything more guaranteed than "it may be beneficial"?
Or if I want greater guarantees will I need to use an MVar and liftIO . forkIO?
If I evaluate a first, I think I can use eitherT to check if a succeeded. But if I evaluate both at the same time I get confused. Here is the situation:
If only a failed, I will retry a; if that fails, I will run a function that manually reverses b.
If only b failed, I will retry b, write to the log in RWS, and return Left.
If both fail, write to the log in RWS and return Left.
If both succeed, process c (which is not as time-sensitive as a or b).
But if I evaluate both in parallel, then how can I identify which one failed? If I use eitherT immediately after a then a will evaluate first. If I use it after b then I won't be able to tell which one failed.
Is there a way I can evaluate the IO calls in parallel but respond differently depending on which one (if any) fails? Or am I left with a choice of parallelism vs failure mitigation?
The solution you are looking for will use forkIO and MVars.
par
par is for multiprocessor parallelism: it helps evaluate pure terms in parallel. It doesn't help with IO. If you do
do
    a <- (someProcess :: IO a)
    ...
then by the time you reach ..., everything from the IO action has happened (if we ignore evil lazy IO), to the point that a can be determined entirely by ordinary evaluation. This means that by the time you do b <- someOtherProcess, all of someProcess is already done. It's too late to do anything in parallel.
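For contrast, here is a sketch of what par is actually meant for: sparking the evaluation of a pure term alongside another (this assumes the parallel package; compile with -threaded to see any benefit).
import Control.Parallel (par, pseq)

main :: IO ()
main = do
    let a = sum     [1 .. 10000000 :: Int]    -- pure work that can be sparked
        b = product [1 .. 1000    :: Integer] -- more pure work
    -- a is sparked for parallel evaluation while b is evaluated here
    print (a `par` (b `pseq` (toInteger a + b)))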
EitherT
You can explicitly examine the Either e a result of an EitherT e m a. runEitherT :: EitherT e m a -> m (Either e a) makes the success or failure explicit in the underlying monad. We can lift that right back into EitherT to make a computation that always succeeds (sometimes with an error) from one that sometimes fails.
import Control.Monad.Trans.Class
examine :: (MonadTrans t, Monad m) => EitherT e m a -> t m (Either e a)
examine = lift . runEitherT
forkIO
The simplest solution for doing two things in IO is forkIO. It starts another lightweight thread that you can forget about.
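As a self-contained illustration of the forkIO-plus-MVar pattern used below (with toy workloads standing in for the real API calls):
import Control.Concurrent

main :: IO ()
main = do
    box <- newEmptyMVar
    _ <- forkIO $ putMVar box $! sum [1 .. 1000000 :: Int] -- force the work in the forked thread
    let here = length "work done in this thread meanwhile"
    there <- takeMVar box -- block until the forked thread delivers its result
    print (here, there)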
If you run a value with your transformer stack, there will be four pieces of data when you are done. The state ConnectState, the written ConnectWriter log, whether the computation was successful, and, depending on whether or not it was successful, either the value or the error.
EitherT String (RWST ConnectReader ConnectWriter ConnectState IO) a
If we write out the structure of this, it looks like
(RWST ConnectReader ConnectWriter ConnectState IO) (Either String a)
which, unwrapping the RWST, is
ConnectReader -> ConnectState -> IO (Either String a, ConnectState, ConnectWriter)
All four of those pieces of information end up in the result of the IO action. If you fork your stack, you need to decide what to do with all of them when you join the results back together. You have already decided that you want to explicitly handle the Either String a. The ConnectWriters can probably be combined together with <>. You will need to decide what to do with ConnectState.
We'll make a fork that returns all four of these pieces of data by shoving them into an MVar.
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad.IO.Class
forkConnectT :: ConnectT a -> ConnectT (MVar (Either String a, ConnectState, ConnectWriter))
forkConnectT cta = do
    result <- liftIO newEmptyMVar
    r <- lift ask
    s <- lift get
    liftIO $ forkIO $ do
        state <- runRWST (runEitherT cta) r s
        putMVar result state
    return result
Later, when we want the result, we can try and see if it is ready. We'll handle the Either for success and failure explicitly, while handling the state and writer behind the scenes.
import Data.Traversable
tryJoinConnectT :: MVar (Either String a, ConnectState, ConnectWriter) -> ConnectT (Maybe (Either String a))
tryJoinConnectT result = liftIO (tryTakeMVar result) >>= traverse reintegrate
Behind the scenes we reintegrate the ConnectWriter by telling this ConnectT to write what was accumulated in the other thread. You will need to decide what to do to combine the two states.
reintegrate :: (a, ConnectState, ConnectWriter) -> ConnectT a
reintegrate (a, s, w) = do
    -- Whatever needs to be done with the state.
    -- stateHere <- lift get
    lift $ tell w
    return a
If we want to wait until the result is ready, we can block reading the MVar. This offers less opportunity for handling errors such as timeouts.
joinConnectT :: MVar (Either String a, ConnectState, ConnectWriter) -> ConnectT (Either String a)
joinConnectT result = liftIO (takeMVar result) >>= reintegrate
Example
Putting it all together, we can fork a task in parallel, do something in this thread explicitly examining the success or failure, join with the result from the other thread, and reason about what to do next with explicit Eithers representing success or failure from each process.
connectBoth :: ConnectT ()
connectBoth = do
    bVar <- forkConnectT $ connectAPI OtherAPI otherFunction
    a <- examine $ connectAPI SomeAPI someFunction
    b <- joinConnectT bVar
    ...
Going farther
If you are paranoid, you will also want to handle exceptions (some of which can be handled by forkFinally) and asynchronous exceptions. You will need to decide whether to bundle these exceptions into your stack or treat IO like it can always throw exceptions.
Consider using async instead of forkIO and MVars.
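A hedged sketch of that alternative, with generic IO actions standing in for the unwrapped runEitherT/runRWST calls: concurrently runs both actions in separate threads, waits for both, and rethrows an exception from either one.
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (concurrently)

main :: IO ()
main = do
    (a, b) <- concurrently
        (threadDelay 100000 >> return "first result")  -- stand-in for one API call
        (threadDelay 100000 >> return "second result") -- stand-in for the other
    print (a, b)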
monad-control (which you already have a dependency on via either) provides mechanisms for building up, one transformer at a time, the type that represents the state of a monad transformer stack. We wrote this by hand as (Either String a, ConnectState, ConnectWriter). If you are going to grow your transformer stack, you might want to get this from MonadTransControl instead. You can restore the state from the forked thread (see the MonadBaseControl section) in the parent to inspect it. You will still need to decide how to deal with the data from the two states.
I'm trying to work out if it's possible to write an abstraction for the following situation. Suppose I have a type a with a function a -> m Bool, e.g. MVar Bool and readMVar. To abstract this concept out, I create a newtype wrapper for the type and its function:
newtype MPredicate m a = MPredicate (a,a -> m Bool)
I can define a fairly simple operation like so:
doUnless :: (Monad m) => MPredicate m a -> m () -> m ()
doUnless (MPredicate (a,mg)) g = mg a >>= \b -> unless b g
main = do
    b <- newMVar False
    let mpred = MPredicate (b, readMVar)
    doUnless mpred (print "foo")
In this case doUnless would print "foo". Aside: I'm not sure whether a type class might be more appropriate to use instead of a newtype.
Now take the code below, which outputs an incrementing number then waits a second and repeats. It does this until it receives a "turn off" instruction via the MVar.
foobar :: MVar Bool -> IO ()
foobar mvb = foobar' 0
  where
    foobar' :: Int -> IO ()
    foobar' x = readMVar mvb >>= \b -> unless b $ do
        let x' = x + 1
        print x'
        threadDelay 1000000
        foobar' x'
goTillEnter :: MVar Bool -> IO ()
goTillEnter mv = do
    _ <- getLine
    _ <- takeMVar mv
    putMVar mv True

main = do
    mvb <- newMVar False
    forkIO $ foobar mvb
    goTillEnter mvb
Is it possible to refactor foobar so that it uses MPredicate and doUnless?
Ignoring the actual implementation of foobar' I can think of a simplistic way of doing something similar:
cycleUnless :: x -> (x -> x) -> MPredicate m a -> m ()
cycleUnless x g mp = let g' x' = doUnless mp (g' $ g x')
                     in g' $ g x
Aside: I feel like fix could be used to make the above neater, though I still have trouble working out how to use it
But cycleUnless won't work on foobar because the type of foobar' is actually Int -> IO () (from the use of print x').
I'd also like to take this abstraction further, so that it can work while threading a monad around. With stateful monads it becomes even harder. E.g.
-- EDIT: Updated the below to show an example of how the code is used
{- ^^ some parent function which has the MVar ^^ -}
cycleST :: (forall s. ST s (STArray s Int Int)) -> IO ()
cycleST sta = readMVar mvb >>= \b -> unless b $ do
    n <- readMVar someMVar
    i <- readMVar someOtherMVar
    let sta' = do
            arr <- sta
            x <- readArray arr n
            writeArray arr n (x + i)
            return arr
        y = runSTArray sta'
    print y
    cycleST sta'
I have something similar to the above working with RankNTypes. Now there's the additional problem of trying to thread through the existential s, which is not likely to type check if threaded around through an abstraction like cycleUnless.
Additionally, this is simplified to make the question easier to answer. I also use a set of semaphores built from MVar [MVar ()] similar to the skip channel example in the MVar module. If I can solve the above problem I plan to generalize the semaphores as well.
Ultimately this isn't some blocking problem. I have 3 components of the application operating in a cycle off the same MVar Bool but doing fairly different asynchronous tasks. In each one I have written a custom function that performs the appropriate cycle.
I'm trying to learn the "don't write large programs" approach. What I'd like to do is refactor chunks of code into their own mini libraries so that I'm not building a large program but assembling lots of small ones. But so far this particular abstraction is escaping me.
Any thoughts on how I might go about this are very much appreciated!
You want to cleanly combine a stateful action having side effects, a delay, and an independent stopping condition.
The iterative monad transformer from the free package can be useful in these cases.
This monad transformer lets you describe a (possibly nonending) computation as a series of discrete steps. And what's better, it lets you interleave "stepped" computations using mplus. The combined computation stops when any of the individual computations stops.
Some preliminary imports:
import Data.Bool
import Control.Monad
import Control.Monad.Trans
import Control.Monad.Trans.Iter (delay,untilJust,IterT,retract,cutoff)
import Control.Concurrent
Your foobar function could be understood as a "sum" of three things:
A computation that does nothing but read from the MVar at each step, and finishes when the MVar is True.
untilTrue :: (MonadIO m) => MVar Bool -> IterT m ()
untilTrue = untilJust . liftM guard . liftIO . readMVar
An infinite computation that takes a delay at each step.
delays :: (MonadIO m) => Int -> IterT m a
delays = forever . delay . liftIO . threadDelay
An infinite computation that prints an increasing series of numbers.
foobar' :: (MonadIO m) => Int -> IterT m a
foobar' x = do
    let x' = x + 1
    liftIO (print x')
    delay (foobar' x')
With this in place, we can write foobar as:
foobar :: (MonadIO m) => MVar Bool -> m ()
foobar v = retract (delays 1000000 `mplus` untilTrue v `mplus` foobar' 0)
The neat thing about this is that you can change or remove the "stopping condition" and the delay very easily.
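For instance (a hedged sketch building on the definitions above, not from the original code), dropping the delay or capping the number of steps with cutoff are one-line changes:
-- No delay between steps: just race the counter against the stop condition.
foobarNoDelay :: (MonadIO m) => MVar Bool -> m ()
foobarNoDelay v = retract (untilTrue v `mplus` foobar' 0)

-- At most 100 steps; Nothing means the cutoff was reached before the MVar turned True.
foobarAtMost :: (MonadIO m) => MVar Bool -> m (Maybe ())
foobarAtMost v = retract (cutoff 100 (delays 1000000 `mplus` untilTrue v `mplus` foobar' 0))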
Some clarifications:
The delay function is not a delay in IO; it just tells the iterative monad transformer to "put the argument in a separate step".
retract brings you back from the iterative monad transformer to the base monad. It's like saying "I don't care about the steps, just run the computation". You can combine retract with cutoff if you want to limit the maximum number of iterations.
untilJust converts a value m (Maybe a) of the base monad into an IterT m a by retrying in each step until a Just is returned. Of course, this risks non-termination!
MPredicate is rather superfluous here; m Bool can be used instead. The monad-loops package contains plenty of control structures with m Bool conditions. whileM_ in particular is applicable here, although we need to include a State monad for the Int that we're threading around:
import Control.Monad.State
import Control.Monad.Loops
import Control.Applicative
foobar :: MVar Bool -> IO ()
foobar mvb = (`evalStateT` (0 :: Int)) $
    whileM_ (not <$> lift (readMVar mvb)) $ do
        modify (+1)
        lift . print =<< get
        lift $ threadDelay 1000000
Alternatively, we can use a monadic version of unless. For some reason monad-loops doesn't export such a function, so let's write it:
unlessM :: Monad m => m Bool -> m () -> m ()
unlessM mb action = do
    b <- mb
    unless b action
It's somewhat more convenient and more modular in a monadic setting, since we can always go from a pure Bool to m Bool, but not vice versa.
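To spell that out, a pure condition lifts into the monadic one with return, so unlessM subsumes unless (illustrative helper, not part of monad-loops):
unlessPure :: Monad m => Bool -> m () -> m ()
unlessPure b = unlessM (return b)
With unlessM in hand, foobar becomes: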
foobar :: MVar Bool -> IO ()
foobar mvb = go 0
  where
    go :: Int -> IO ()
    go x = unlessM (readMVar mvb) $ do
        let x' = x + 1
        print x'
        threadDelay 1000000
        go x'
You mentioned fix; sometimes people indeed use it for ad-hoc monadic loops, for example:
printUntil0 :: IO ()
printUntil0 = do
    putStrLn "hello"
    fix $ \loop -> do -- 'fix' comes from Data.Function
        n <- fmap read getLine :: IO Int
        print n
        when (n /= 0) loop
    putStrLn "bye"
With some juggling it's possible to use fix with multi-argument functions. In the case of foobar:
foobar :: MVar Bool -> IO ()
foobar mvb = ($ (0 :: Int)) $ fix $ \loop x -> do
    unlessM (readMVar mvb) $ do
        let x' = x + 1
        print x'
        threadDelay 1000000
        loop x'
I'm not sure what your MPredicate is doing.
First, instead of newtyping a tuple, it's probably better to use a normal algebraic data type:
data MPredicate a m = MPredicate a (a -> m Bool)
Second, the way you use it, MPredicate is equivalent to m Bool. Haskell is lazy, so there is no need to pass a function together with its argument (even though that is useful in strict languages). Just pass the result, and the function will be called when needed. That is, instead of passing (x, f) around, just pass f x.
Of course, if you are not trying to delay the evaluation and really do need, at some point, the argument or the function as well as the result, a tuple is fine.
Anyway, in the case where your MPredicate is only there to delay the function evaluation, MPredicate reduces to m Bool and doUnless to a monadic unless.
Your first example is then equivalent to:
main = do
    b <- newMVar False
    readMVar b >>= \p -> unless p (print "foo")
Now, if you want to loop a monad until a condition is reached (or similar), you should have a look at the monad-loops package. What you are looking for is probably untilM_ or an equivalent.
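A hedged sketch of that suggestion, assuming monad-loops' untilM_ (which runs the body and only then checks the condition, so the body runs at least once):
import Control.Concurrent.MVar
import Control.Monad.Loops (untilM_)

-- Repeat an action until the MVar holds True (checked after each run).
loopUntilStopped :: MVar Bool -> IO () -> IO ()
loopUntilStopped stop body = body `untilM_` readMVar stop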
You have a sequence of actions that prefer to be executed in chunks due to some high fixed overhead, like packet headers or making connections. The limitation is that sometimes the next action depends on the result of a previous one, in which case all pending actions are executed at once.
Example:
mySession :: Session IO ()
mySession = do
    a <- readit -- nothing happens yet
    b <- readit -- nothing happens yet
    c <- readit -- nothing happens yet
    if a -- all three readits execute because we need a
        then write "a"
        else write "..."
    if b || c -- b and c already available
        ...
This reminds me of so many Haskell concepts but I can't put my finger on it.
Of course, you could do something obvious like:
[a,b,c] <- batch [readit, readit, readit]
But I'd like to hide the fact of chunking from the user for slickness purposes.
Not sure if Session is the right word. Maybe you can suggest a better one? (Packet, Batch, Chunk and Deferred come to mind.)
Update
I think there was a really good answer last night that I read on my phone but when I came back to look for it today it was gone. Was I dreaming?
I don't think you can do exactly what you want, since what you describe exploits Haskell's lazy evaluation to have the evaluation of a force the actions that compute b and c, and there's no way to seq on unspecified values.
What I could do was hack together a monad transformer that delayed actions sequenced via >> so that they could be executed all together:
data Session m a = Session { pending :: [m ()], final :: m a }

runSession :: Monad m => Session m a -> m a
runSession (Session ms ma) = foldr (flip (>>)) (return ()) ms >> ma

instance Monad m => Monad (Session m) where
    return = Session [] . return
    s >>= f = Session [] $ runSession s >>= (runSession . f)
    (Session ms ma) >> (Session ms' ma') =
        Session (ms' ++ (ma >> return ()) : ms) ma'
This violates some monad laws (in particular, m >> n no longer agrees with m >>= \_ -> n, since >>= flushes the pending actions while >> defers them), but lets you do something like:
liftIO :: IO a -> Session IO a
liftIO = Session []
exampleSession :: Session IO Int
exampleSession = do
    liftIO $ putStrLn "one"
    liftIO $ putStrLn "two"
    liftIO $ putStrLn "three"
    liftIO $ putStrLn "four"
    trace "five" $ return 5 -- 'trace' comes from Debug.Trace
and get
ghci> runSession exampleSession
five
one
two
three
four
5
ghci> length (pending exampleSession)
4
This is very similar to what Haxl does.
For more info:
Open sourcing haxl - Facebook Code Blog
ICFP 2014 talk
You could use the unsafeInterleaveIO function. It is a dangerous function that can introduce bugs to your program if not used carefully, but it does what you're asking for.
You can insert it into your example code like this:
lazyReadits :: IO [a]
lazyReadits = unsafeInterleaveIO $ do -- unsafeInterleaveIO comes from System.IO.Unsafe
    a <- readit
    r <- lazyReadits
    return (a:r)
unsafeInterleaveIO makes the action as a whole lazy, but once it starts evaluating it will evaluate as if it had been strict. This means in my above example: readit will run as soon as something tests whether the returned list is empty or not. If I'd used mapM unsafeInterleaveIO (replicate 3 readit) instead, then readit would only be run when the actual elements of the list are evaluated, which would make the contents of the list depend on the order in which its elements are inspected, which is one example of how unsafeInterleaveIO can introduce bugs.
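A self-contained demonstration of that behaviour, with an IORef counter standing in for readit so the effects are visible:
import Data.IORef
import System.IO.Unsafe (unsafeInterleaveIO)

main :: IO ()
main = do
    counter <- newIORef (0 :: Int)
    let readit = atomicModifyIORef' counter (\n -> (n + 1, n + 1))
    xs <- unsafeInterleaveIO (sequence (replicate 3 readit))
    readIORef counter >>= print -- 0: nothing has run yet
    print (length xs)           -- demanding the list runs the whole action at once
    readIORef counter >>= print -- 3: all three readits have now run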
Scenario: I have an interpreter that builds up values bottom-up from an AST. Certain nodes come with permissions -- additional boolean expressions. Permission failures should propagate, but if a node above in the AST comes with a permission, a success can recover the computation and stop the propagation of the error.
At first I thought the Error MyError MyValue monad would be enough: one of the members of MyError could be PermError, and I could use catchError to recover from PermError if the second check succeeds. However, MyValue is gone by the time I get to the handler. I guess there could ultimately be a way of having PermError carry a MyValue field so that the handler could restore it, but it would probably be ugly and checking for an exception at each step would defeat the concept of an exceptional occurrence.
I'm trying to think of an alternative abstraction. Basically I have to return a datatype Either AllErrorsExceptPermError (Maybe PermError, MyValue), or more simply (Maybe AllErrors, MyValue) (the other errors are unrecoverable and fit the error monad pretty well), and I'm looking for something that would save me from juggling the tuple around, since there seems to be a common pattern in how the operations are chained. My Haskell knowledge only goes so far. How would you use Haskell to your advantage in this situation?
While I write this I came up with an idea (SO is a fancy rubber duck): a monad that internally handles a type (a, b) (and ultimately returns it when the monadic computation terminates; there has to be some kind of runMyMonad), but lets me work with the type b directly as much as possible. Something like
data T = Pass | Fail | Nothing

instance Monad (T, b) where
    return v = (Nothing, v)
    (Pass, v) >>= g = let (r', v') = g v in (if r' == Fail then Fail else Pass, v')
    (Fail, v) >>= g = let (r', v') = g v in (if r' == Pass then Pass else Fail, v')
    (Nothing, _) >>= g = error "This should not have been propagated, all chains should start with Pass or Fail"
Errors have been simplified into T, and the instance line probably has a syntax error, but you should get the idea. Does this make sense?
I think you can use the State monad for the permissions and value calculation, and wrap that inside the ErrorT monad transformer to handle the errors. Below is an example which shows the idea: here the calculation is summing up a list, the "permissions" are the number of even numbers in the list, and the error condition is seeing a 0 in the list.
import Control.Monad.Error
import Control.Monad.State
data ZeroError = ZeroError String
    deriving (Show)

instance Error ZeroError where
    strMsg = ZeroError
fun :: [Int] -> ErrorT ZeroError (State Int) Int
fun [] = return 0
fun (0:xs) = throwError $ ZeroError "Zero found"
fun (x:xs) = do
    i <- get
    put (if even x then i + 1 else i)
    z <- fun xs
    return $ x + z
main = f $ runState (runErrorT $ fun [1,2,4,5,10]) 0
  where
    f (Left e, evens)  = putStr $ show e
    f (Right r, evens) = putStr $ show (r, evens)
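Note that Control.Monad.Error is deprecated in recent mtl releases; as a hedged sketch, the same program in terms of Control.Monad.Except needs no Error instance at all:
import Control.Monad.Except
import Control.Monad.State

fun' :: [Int] -> ExceptT String (State Int) Int
fun' []     = return 0
fun' (0:_)  = throwError "Zero found"
fun' (x:xs) = do
    modify (\i -> if even x then i + 1 else i) -- count the evens, as above
    (x +) <$> fun' xs

main :: IO ()
main = print $ runState (runExceptT (fun' [1,2,4,5,10])) 0
-- prints (Right 22,3): the sum and the count of even numbers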