How do you reason about the order of execution of functions in a monadT stack?

How do you reason about the order of execution of functions in a monadT stack? - haskell

General theme: While I find the idea of stacking monads together is very appealing, I am having a lot of trouble picturing how the code is executed, and what are the appropriate orders to run the layers. Below is one example of a stack: Writer, State, State, and Error, in no particular order ( or is there? ).
-----------------------
-- Utility Functions --
-----------------------
type Memory = Map String Int
type Counter = Int
type Log = String
tick :: (MonadState Counter m) => m ()
tick = modify (+1)
record :: (MonadWriter Log m) => Log -> m ()
record msg = tell $ msg ++ "; "
------------------
-- MonadT Stack --
------------------
mStack :: ( MonadTrans t, MonadState Memory m, MonadState Counter (t m), MonadError ErrMsg (t m), MonadWriter Log (t m) ) => t m Int
mStack = do
tick
m <- lift get
let x = fromJust ( M.lookup "x" m ) in x
record "accessed memory"
case True of
True -> return 100
False -> throwError "false"
Please note in mStack, whether an error is thrown or not has nothing to do with any other part of the function.
Now ideally I want the output to look like this:
( Right 100, 1, "accessed memory", fromList [...])
or in general:
( output of errorT, output of stateT Counter, output of writerT, output of StateT Memory )
But I cannot get it to work. Specifically, I tried running the stack as if Error is on the outermost layer:
mem1 = M.fromList [("x",10),("y",5)]
runIdentity $ runWriterT (runStateT (runStateT (runErrorT mStack ) 0 ) mem1 ) ""
But am getting this error message:
Couldn't match type `Int' with `Map [Char] Int'
The above instance aside, in general, when I am calling:
runMonadT_1 ( runMonadT_2 expr param2 ) param1,
are the functions relating to monadT_2 run first, then that output is piped into the functions relating to monadT_1 ? So in other words, as imperative as the code looks in the above function mStack, is the order of execution entirely dependent upon the order in which the monadT are run ( aside from any rigidness in structure introduced by lift ) ?

You would have gotten a more informative type error if you had tried to type your computation using an explicit monad transformer stack:
mStack :: ErrorT String (StateT (Map String Int) (StateT Int Writer)) Int
Had you done that, ghc would have caught the type error earlier. The reason is that you use the following two commands within mStack at the top-most level:
modify (+1) -- i.e. from `tick`
...
yourMap <- lift get
If you were to give this an explicit stack, then you'd catch the mistake: both modify and lift get are going to target the first StateT layer they encounter, which happens to be the same StateT layer.
modify begins from the ErrorT layer and proceeds downward until it hits the outer StateT layer, and concludes that the outer StateT must be using an Int state. get begins from the outer StateT layer, notices that it's already in a StateT layer and ignores the inner StateT layer entirely, so it concludes that the outer StateT layer must be storing a Map.
ghc then says "What gives? This layer can't be storing both an Int and a Map!", which explains the type error you got. However, because you used type classes instead of a concrete monad transformer stack, there was no way that ghc could know that this was a type error in waiting until you specified a concrete stack.
The solution is simple: just add another lift to your get and it will now target the inner StateT layer like you intended.
I personally prefer to avoid mtl classes entirely and always work with a concrete monad transformer stack using the transformers library alone. It's more verbose because you have to be precise about which layer you want using lift, but it causes fewer headaches down the road.

Related

How to understand `MonadUnliftIO`'s requirement of "no stateful monads"?

I've looked over https://www.fpcomplete.com/blog/2017/06/tale-of-two-brackets, though skimming some parts, and I still don't quite understand the core issue "StateT is bad, IO is OK", other than vaguely getting the sense that Haskell allows one to write bad StateT monads (or in the ultimate example in the article, MonadBaseControl instead of StateT, I think).
In the haddocks, the following law must be satisfied:
askUnliftIO >>= (\u -> liftIO (unliftIO u m)) = m
So this appears to be saying that state is not mutated in the monad m when using askUnliftIO. But to my mind, in IO, the entire world can be the state. I could be reading and writing to a text file on disk, for instance.
To quote another article by Michael,
False purity We say WriterT and StateT are pure, and technically they
are. But let's be honest: if you have an application which is entirely
living within a StateT, you're not getting the benefits of restrained
mutation that you want from pure code. May as well call a spade a
spade, and accept that you have a mutable variable.
This makes me think this is indeed the case: with IO we are being honest, with StateT, we are not being honest about mutability ... but that seems another issue than what the law above is trying to show; after all, MonadUnliftIO is assuming IO. I'm having trouble understanding conceptually how IO is more restrictive than something else.
Update 1
After sleeping (some), I am still confused but am gradually getting less so as the day wears on. I worked out the law proof for IO. I realized the presence of id in the README. In particular,
instance MonadUnliftIO IO where
askUnliftIO = return (UnliftIO id)
So askUnliftIO would appear to return an IO (IO a) on an UnliftIO m.
Prelude> fooIO = print 5
Prelude> :t fooIO
fooIO :: IO ()
Prelude> let barIO :: IO(IO ()); barIO = return fooIO
Prelude> :t barIO
barIO :: IO (IO ())
Back to the law, it really appears to be saying that state is not mutated in the monad m when doing a round trip on the transformed monad (askUnliftIO), where the round trip is unLiftIO -> liftIO.
Resuming the example above, barIO :: IO (), so if we do barIO >>= (u -> liftIO (unliftIO u m)), then u :: IO () and unliftIO u == IO (), then liftIO (IO ()) == IO (). **So since everything has basically been applications of id under the hood, we can see that no state was changed, even though we are using IO. Crucially, I think, what is important is that the value in a is never run, nor is any other state modified, as a result of using askUnliftIO. If it did, then like in the case of randomIO :: IO a, we would not be able to get the same value had we not run askUnliftIO on it. (Verification attempt 1 below)
But, it still seems like we could do the same for other Monads, even if they do maintain state. But I also see how, for some monads, we may not be able to do so. Thinking of a contrived example: each time we access the value of type a contained in the stateful monad, some internal state is changed.
Verification attempt 1
> fooIO >> askUnliftIO
5
> fooIOunlift = fooIO >> askUnliftIO
> :t fooIOunlift
fooIOunlift :: IO (UnliftIO IO)
> fooIOunlift
5
Good so far, but confused about why the following occurs:
> fooIOunlift >>= (\u -> unliftIO u)
<interactive>:50:24: error:
* Couldn't match expected type `IO b'
with actual type `IO a0 -> IO a0'
* Probable cause: `unliftIO' is applied to too few arguments
In the expression: unliftIO u
In the second argument of `(>>=)', namely `(\ u -> unliftIO u)'
In the expression: fooIOunlift >>= (\ u -> unliftIO u)
* Relevant bindings include
it :: IO b (bound at <interactive>:50:1)

"StateT is bad, IO is OK"
That's not really the point of the article. The idea is that MonadBaseControl permits some confusing (and often undesirable) behaviors with stateful monad transformers in the presence of concurrency and exceptions.
finally :: StateT s IO a -> StateT s IO a -> StateT s IO a is a great example. If you use the "StateT is attaching a mutable variable of type s onto a monad m" metaphor, then you might expect that the finalizer action gets access to the most recent s value when an exception was thrown.
forkState :: StateT s IO a -> StateT s IO ThreadId is another one. You might expect that the state modifications from the input would be reflected in the original thread.
lol :: StateT Int IO [ThreadId]
lol = do
for [1..10] $ \i -> do
forkState $ modify (+i)
You might expect that lol could be rewritten (modulo performance) as modify (+ sum [1..10]). But that's not right. The implementation of forkState just passes the initial state to the forked thread, and then can never retrieve any state modifications. The easy/common understanding of StateT fails you here.
Instead, you have to adopt a more nuanced view of StateT s m a as "a transformer that provides a thread-local immutable variable of type s which is implicitly threaded through a computation, and it is possible to replace that local variable with a new value of the same type for future steps of the computation." (more or less a verbose english retelling of the s -> m (a, s)) With this understanding, the behavior of finally becomes a bit more clear: it's a local variable, so it does not survive exceptions. Likewise, forkState becomes more clear: it's a thread-local variable, so obviously a change to a different thread won't affect any others.
This is sometimes what you want. But it's usually not how people write code IRL and it often confuses people.
For a long time, the default choice in the ecosystem to do this "lowering" operation was MonadBaseControl, and this had a bunch of downsides: hella confusing types, difficult to implement instances, impossible to derive instances, sometimes confusing behavior. Not a great situation.
MonadUnliftIO restricts things to a simpler set of monad transformers, and is able to provide relatively simple types, derivable instances, and always predictable behavior. The cost is that ExceptT, StateT, etc transformers can't use it.
The underlying principle is: by restricting what is possible, we make it easier to understand what might happen. MonadBaseControl is extremely powerful and general, and quite difficult to use and confusing as a result. MonadUnliftIO is less powerful and general, but it's much easier to use.
So this appears to be saying that state is not mutated in the monad m when using askUnliftIO.
This isn't true - the law is stating that unliftIO shouldn't do anything with the monad transformer aside from lowering it into IO. Here's something that breaks that law:
newtype WithInt a = WithInt (ReaderT Int IO a)
deriving newtype (Functor, Applicative, Monad, MonadIO, MonadReader Int)
instance MonadUnliftIO WithInt where
askUnliftIO = pure (UnliftIO (\(WithInt readerAction) -> runReaderT 0 readerAction))
Let's verify that this breaks the law given: askUnliftIO >>= (\u -> liftIO (unliftIO u m)) = m.
test :: WithInt Int
test = do
int <- ask
print int
pure int
checkLaw :: WithInt ()
checkLaw = do
first <- test
second <- askUnliftIO >>= (\u -> liftIO (unliftIO u test))
when (first /= second) $
putStrLn "Law violation!!!"
The value returned by test and the askUnliftIO ... lowering/lifting are different, so the law is broken. Furthermore, the observed effects are different, which isn't great either.

Haskell MTL: How to exit the monad and get the value within it?

I know how to use functions from each monad inside a do block. But once i'm finished how do I run the computation and get the result?
run :: (MonadError Error m, MonadState State m) => m Int -> Int
run = ???

The whole point of mtl is that you don't specify your concrete monad transformer stack - you just specify that it has to handle errors and hold state.
However, once you actually want to run this action, you do need to tie yourself down to a particular monad transformer stack. That stack will inform how you can "run" your monadic action. In your case, it looks like you probably want to go with ExcepT Error (State State):
action :: (MonadError Error m, MonadState State m) => m Int
action = undefined
runAction :: State -> -- initial state
Either Error Int -- account for possibility of error
runAction initialState = evalState (runExceptT action) initialState
Note that this isn't the only choice of a concrete stack you could have made. In fact, you could even swap the order of the State and Except and choose StateT State (Either Error) as your monad!
runAction :: State -> -- initial state
Either Error Int -- account for possibility of error
runAction initialState = evalState action initialState
The fact that mtl says nothing about the order of monads in your stack is one of its shortcomings (and one of the reasons why some people prefer to just stick with transformers).

Haskell and calling function.

data S = Sa Int
type PMO = StateT Int (ErrorT String IO)
cs :: S -> PMO ()
cs _ = do
mem <- get -- (*)
return ()
I've highlighted the line with (*). I have doubts why get function can be called. I know that get is function from State monad but I cannot see/grasp how it is known that there is such monad in fact. I see that returned type is PMO but it is just returned value. What does it has in common with get?
Mainly, I have an experience with imperative language programming so it makes me trouble to understand it.

Check out the type of get:
get :: MonadState s m => m s
And since in the type signature of cs you've told the compiler what m is, get becomes:
get :: StateT Int (ErrorT String IO) Int
That's just a monadic value, not a function. So where does the value named mem come from? Ultimately, the value that get provides comes from the initial state value supplied by runStateT (or execStateT or evalStateT).
If that still seems mysterious, I recommend studying up on how the state monad works.

How to preserve information when failing?

I'm writing some code that uses the StateT monad transformer to keep track of some stateful information (logging and more).
The monad I'm passing to StateT is very simple:
data CheckerError a = Bad {errorMessage :: Log} | Good a
deriving (Eq, Show)
instance Monad CheckerError where
return x = Good x
fail msg = Bad msg
(Bad msg) >>= f = Bad msg
(Good x) >>= f = f x
type CheckerMonad a = StateT CheckerState CheckerError a
It's just a Left and Right variant.
What troubles me is the definition of fail. In my computation I produce a lot of information inside this monad and I'd like to keep this information even when failing.
Currently the only thing I can do is to convert everything to a String and create a Bad instance with the String passed as argument to fail.
What I'd like to do is something like:
fail msg = do
info <- getInfoOutOfTheComputation
return $ Bad info
However everything I tried until now gives type errors, probably because this would mix different monads.
Is there anyway in which I can implement fail in order to preserve the information I need without having to convert all of it into a String?
I cannot believe that the best Haskell can achieve is using show+read to pass all the information as the string to fail.

Your CheckerError monad is very similar to the Either monad. I will use the Either monad (and its monad transformer counterpart ErrorT) in my answer.
There is a subtlety with monad trasformers: order matters. Effects in the "inner" monad have primacy over effects caused by the "outer" layers. Consider these two alternative definitions of CheckerMonad:
import Control.Monad.State
import Control.Monad.Error
type CheckerState = Int -- dummy definitions for convenience
type CheckerError = String
type CheckerMonad a = StateT CheckerState (Either String) a
type CheckerMonad' a = ErrorT String (State CheckerState) a
In CheckerMonad, Either is the inner monad, and this means a failure will wipe the whole state. Notice the type of this run function:
runCM :: CheckerMonad a -> CheckerState -> Either CheckerError (a,CheckerState)
runCM m s = runStateT m s
You either fail, or return a result along with the state up to that point.
In CheckerMonad', State is the inner monad. This means the state will be preserved even in case of failures:
runCM' :: CheckerMonad' a -> CheckerState -> (Either CheckerError a,CheckerState)
runCM' m s = runState (runErrorT m) s
A pair is returned, which contains the state up to that point, and either a failure or a result.
It takes a bit of practice to develop an intuition of how to properly order monad transformers. The chart in the Type juggling section of this Wikibook page is a good starting point.
Also, it is better to avoid using fail directly, because it is considered a bit of a wart in the language. Instead, use the specialized functions for throwing errors provided by the error transformer. When working with ErrorT or some other instance of MonadError, use throwError.
sillycomp :: CheckerMonad' Bool
sillycomp = do
modify (+1)
s <- get
if s == 3
then throwError "boo"
else return True
*Main> runCM' sillycomp 2
Loading package transformers-0.3.0.0 ... linking ... done.
Loading package mtl-2.1.2 ... linking ... done.
(Left "boo",3)
*Main> runCM' sillycomp 3
(Right True,4)
ErrorT is sometimes annoying to use because, unlike Either, it requires an Error constraint on the error type. The Error typeclass forces you to define two error constructors noMsg and strMsg, which may or may not make sense for your type.
You can use EitherT from the either package instead, which lets you use any type whatsoever as the error. When working with EitherT, use the left function to throw errors.

How do you stack two ErrorT monad transformers on top of each other?

Let's say I have these two functions:
errorm :: ( MonadError String m ) => Bool -> m Int
errorm cond = if cond then return 1 else throwError "this is an error"
errorms :: ( MonadError String m ) => Bool -> m String
errorms cond = if cond then return "works" else throwError "does not work"
As you can see, one returns a string in the safe case, while the other returns an int
I now want to use them together within another monad. Trivially:
errErr :: MonadError String m => Bool -> Bool -> m (Int, String)
errErr b1 b2 = do
a <- errorm b1
b <- errorms b2
return (a,b)
The function signature here is derived by the GHC, and I am not sure how to use this function. I tried this:
runErrorT ( runErrorT ( errErr1 True True ) ) -- should give me (Right 1, Right "works")
But instead it gives me:
Ambiguous type variable `e0' in the constraint:
(Error e0) arising from a use of `errErr1'
Probable fix: add a type signature that fixes these type variable(s)
In the first argument of `runErrorT', namely `(errErr1 True True)'
In the first argument of `runErrorT', namely
`(runErrorT (errErr1 True True))'
In the expression: runErrorT (runErrorT (errErr1 True True))
In general, this is just one instance of my problem. I feel like I am not grasping how exactly to stack two monadT that are of the same class, but have different type parameters. Another example might be stacking the pair of a functions:
f :: ( MonadState Int m ) => m ()
g :: ( MonadState String m ) => m ()
---------------------------------------------------- Update ----------------------------------------------------
Per Daniel's comment below, I added a concrete instance of functions f and g from above. But thanks to Tikhon's answer, I think I figured it out.
type Counter = Int
type Msg = String
incr :: (MonadState Counter m) => Counter -> m ()
incr i = modify (+i)
addMsg :: ( MonadState Msg m ) => Msg -> m()
addMsg msg = modify ( ++ msg )
incrMsg:: (MonadTrans t, MonadState Msg m, MonadState Counter (t m)) => t m ()
incrMsg = do
lift . addMsg $ "one"
incr 1
return ()
incrMsgt = runIdentity $ runStateT ( runStateT incrMsg 1 ) "hello" :: (((), Int), String)

In this particular case, you do not need to stack two transformers--since both are MonadError String, they can be used as the same monad. You can use errorm and errorms together just like you would use two values in any other monad.
As a more concrete explanation, ignore the transformers for a second: you can imagine that the values are just Either String Int and Either String String. Clearly, you can just use them together. This is why you only need one runErrorT at the end rather than two: both values are in the same monad.
Now, what about your actual question: how do you stack two monad transformers? It works just like combining any two monad transformers. Two state transformers stacked upon each other look just like two different transformers stacked upon each other.
Now, it is a little tricky to use them. Depending on which one you're using, you will need to use lift differently. If you have a value in the base monad, you need to lift twice. If you have a value in the inner state monad, you will need to use it once. And if you have one in the outer level, you won't need it at all. This is just like normal transformers.
Going back to your error example, let's imagine you actually did want to stack two different error monad transformers instead of using them as one. This would mean that if you want to throw an error in the inner one, you would have to write lift (throwError "message"). If you had actually done this and had two stacked error transformers, than using runErrorT twice would have worked.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string