Haskell and calling function. - haskell

data S = Sa Int
type PMO = StateT Int (ErrorT String IO)
cs :: S -> PMO ()
cs _ = do
    mem <- get -- (*)
    return ()
I've marked the line in question with (*). I don't understand why get can be called there. I know that get comes from the State monad, but I can't see how the compiler knows that such a monad is in play. I see that the return type is PMO (), but that is just the type of the returned value. What does it have to do with get?
My experience is mainly with imperative languages, so I have trouble understanding this.

Check out the type of get:
get :: MonadState s m => m s
And since in the type signature of cs you've told the compiler what m is, get becomes:
get :: StateT Int (ErrorT String IO) Int
That's just a monadic value, not a function. So where does the value named mem come from? Ultimately, the value that get provides comes from the initial state value supplied by runStateT (or execStateT or evalStateT).
If that still seems mysterious, I recommend studying up on how the state monad works.
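To make the role of the initial state concrete, here is a minimal sketch of running cs. It assumes mtl is available and uses ExceptT in place of the question's ErrorT (ErrorT is deprecated and absent from recent library versions); cs is also changed to return mem so that the value is visible. The value 42 is an arbitrary initial state chosen for illustration.

```haskell
import Control.Monad.Except (ExceptT, runExceptT)
import Control.Monad.State (StateT, get, runStateT)

data S = Sa Int

-- Assumption: ExceptT stands in for the deprecated ErrorT of the question.
type PMO = StateT Int (ExceptT String IO)

-- Same shape as the question's cs, but returning the state it read.
cs :: S -> PMO Int
cs _ = do
    mem <- get      -- get :: PMO Int, because the signature fixes the monad
    return mem

main :: IO ()
main = do
    -- The Int that get yields is the initial state passed to runStateT.
    r <- runExceptT (runStateT (cs (Sa 0)) 42)
    print r         -- Right (42,42)
```

The pair inside Right is (result, final state); both are 42 because cs only reads the state.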

Related

Why is the output of this Haskell function an IO String instead of a String

I'm learning Haskell through learnyouahaskell.com and wanted to test some of the concepts before finishing the input/output module. I haven't been able to google or hoogle my way out of this question, though, even though it seems quite simple.
When I try to run the following code
getName = do
    name <- getLine
    return name
the output of getName is a value of type IO String instead of String, even though name is definitely a String.
After reading the documentation and other Stack Overflow questions, I couldn't figure out why this happens when I declare getName as a function (when I use the bind <- operation directly in main there's no problem whatsoever).
The return function is not conceptually the same as what return does in languages like C++, Java and Python. return :: Monad m => a -> m a takes an a (here a String), and produces an m a (here IO String).
The do notation is syntactic sugar. If we desugar the statement, you wrote:
getName = getLine >>= (\name -> return name)
or cleaner:
getName = getLine >>= return
The bind function (>>=) :: Monad m => m a -> (a -> m b) -> m b takes an m a as its first operand and a function a -> m b as its second, and produces an m b. Since getLine :: IO String, m is IO and a is String. The type return :: Monad m => a -> m a makes it clear that here b is the same as a, so getName :: IO String.
So what is IO here? A frequently used metaphor is that of a recipe. In this metaphor an IO a is a set of instructions that, when followed, yields an a. But that does not mean that the recipe is an a.
(>>=) here basically says: on the left-hand side I have a recipe to make an a, on the right-hand side I have a function that converts that a into a recipe to make a b, so we can construct a recipe to make a b from these two.
People often ask how to unwrap an a out of an IO a, but conceptually that does not make much sense. You cannot "unwrap" the cake out of a recipe to make a cake. You can follow the instructions to make a cake. Following the instructions is something main will eventually do. We can thus construct a (long) recipe that main will carry out. But we cannot unwrap the values.
Strictly speaking there is a function unsafePerformIO :: IO a -> a that can do that, but it is strongly advised not to use it. Functions in Haskell are supposed to be pure, which means that for the same input we always get the same output. getLine itself is pure, since it always produces the same recipe (an IO String).
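The desugaring and the recipe metaphor can be sketched as follows. To keep the example runnable without keyboard input, it uses a hypothetical fakeGetLine in place of getLine; shout is likewise an illustrative name, not something from the question.

```haskell
import Data.Char (toUpper)

-- Assumption: a stand-in for getLine so the sketch needs no input.
fakeGetLine :: IO String
fakeGetLine = return "cake"

-- do-notation, as in the question.
getName1 :: IO String
getName1 = do
    name <- fakeGetLine
    return name

-- The same action after desugaring the do block.
getName2 :: IO String
getName2 = fakeGetLine >>= (\name -> return name)

-- Instead of unwrapping the String, extend the recipe with another step.
shout :: IO String -> IO String
shout recipe = recipe >>= (\s -> return (map toUpper s))

main :: IO ()
main = do
    a <- getName1
    b <- getName2
    c <- shout getName1
    print (a, b, c)  -- ("cake","cake","CAKE")
```

Only inside main (or another IO action) is the String available by name, and only as an intermediate step of a larger recipe.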

How to understand `MonadUnliftIO`'s requirement of "no stateful monads"?

I've looked over https://www.fpcomplete.com/blog/2017/06/tale-of-two-brackets, though skimming some parts, and I still don't quite understand the core issue "StateT is bad, IO is OK", other than vaguely getting the sense that Haskell allows one to write bad StateT monads (or in the ultimate example in the article, MonadBaseControl instead of StateT, I think).
In the haddocks, the following law must be satisfied:
askUnliftIO >>= (\u -> liftIO (unliftIO u m)) = m
So this appears to be saying that state is not mutated in the monad m when using askUnliftIO. But to my mind, in IO, the entire world can be the state. I could be reading and writing to a text file on disk, for instance.
To quote another article by Michael,
False purity We say WriterT and StateT are pure, and technically they
are. But let's be honest: if you have an application which is entirely
living within a StateT, you're not getting the benefits of restrained
mutation that you want from pure code. May as well call a spade a
spade, and accept that you have a mutable variable.
This makes me think this is indeed the case: with IO we are being honest, with StateT, we are not being honest about mutability ... but that seems another issue than what the law above is trying to show; after all, MonadUnliftIO is assuming IO. I'm having trouble understanding conceptually how IO is more restrictive than something else.
Update 1
After sleeping (some), I am still confused but am gradually getting less so as the day wears on. I worked out the law proof for IO. I realized the presence of id in the README. In particular,
instance MonadUnliftIO IO where
askUnliftIO = return (UnliftIO id)
So askUnliftIO would appear to return an IO (IO a) on an UnliftIO m.
Prelude> fooIO = print 5
Prelude> :t fooIO
fooIO :: IO ()
Prelude> let barIO :: IO(IO ()); barIO = return fooIO
Prelude> :t barIO
barIO :: IO (IO ())
Back to the law, it really appears to be saying that state is not mutated in the monad m when doing a round trip on the transformed monad (askUnliftIO), where the round trip is unliftIO -> liftIO.
Resuming the example above, barIO :: IO (), so if we do barIO >>= (\u -> liftIO (unliftIO u m)), then u :: IO () and unliftIO u == IO (), then liftIO (IO ()) == IO (). So since everything has basically been applications of id under the hood, we can see that no state was changed, even though we are using IO. Crucially, I think, what is important is that the value in a is never run, nor is any other state modified, as a result of using askUnliftIO. If it were, then, as in the case of randomIO :: IO a, we would not be able to get the same value as we would have had we not run askUnliftIO on it. (Verification attempt 1 below)
But, it still seems like we could do the same for other Monads, even if they do maintain state. But I also see how, for some monads, we may not be able to do so. Thinking of a contrived example: each time we access the value of type a contained in the stateful monad, some internal state is changed.
Verification attempt 1
> fooIO >> askUnliftIO
5
> fooIOunlift = fooIO >> askUnliftIO
> :t fooIOunlift
fooIOunlift :: IO (UnliftIO IO)
> fooIOunlift
5
Good so far, but confused about why the following occurs:
> fooIOunlift >>= (\u -> unliftIO u)
<interactive>:50:24: error:
* Couldn't match expected type `IO b'
with actual type `IO a0 -> IO a0'
* Probable cause: `unliftIO' is applied to too few arguments
In the expression: unliftIO u
In the second argument of `(>>=)', namely `(\ u -> unliftIO u)'
In the expression: fooIOunlift >>= (\ u -> unliftIO u)
* Relevant bindings include
it :: IO b (bound at <interactive>:50:1)
"StateT is bad, IO is OK"
That's not really the point of the article. The idea is that MonadBaseControl permits some confusing (and often undesirable) behaviors with stateful monad transformers in the presence of concurrency and exceptions.
finally :: StateT s IO a -> StateT s IO a -> StateT s IO a is a great example. If you use the "StateT is attaching a mutable variable of type s onto a monad m" metaphor, then you might expect that the finalizer action gets access to the most recent s value when an exception was thrown.
forkState :: StateT s IO a -> StateT s IO ThreadId is another one. You might expect that the state modifications from the input would be reflected in the original thread.
lol :: StateT Int IO [ThreadId]
lol = do
    for [1..10] $ \i -> do
        forkState $ modify (+i)
You might expect that lol could be rewritten (modulo performance) as modify (+ sum [1..10]). But that's not right. The implementation of forkState just passes the initial state to the forked thread, and then can never retrieve any state modifications. The easy/common understanding of StateT fails you here.
Instead, you have to adopt a more nuanced view of StateT s m a as "a transformer that provides a thread-local immutable variable of type s which is implicitly threaded through a computation, and it is possible to replace that local variable with a new value of the same type for future steps of the computation" (more or less a verbose English retelling of s -> m (a, s)). With this understanding, the behavior of finally becomes a bit clearer: it's a local variable, so it does not survive exceptions. Likewise, forkState becomes clearer: it's a thread-local variable, so obviously a change made in a different thread won't affect any others.
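The lost updates can be reproduced with a hand-written forkState. This is only a sketch: forkState is not a library function, and the definition below is one plausible implementation (assumed here) built on forkIO and the transformers package. The MVar named done is also an addition, used only so that main waits for the forked threads.

```haskell
import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.State (StateT (..), execStateT, modify)

-- Hypothetical forkState: the child starts from the parent's current state,
-- but whatever state the child ends with is discarded.
forkState :: StateT s IO a -> StateT s IO ThreadId
forkState action = StateT $ \s -> do
    tid <- forkIO (runStateT action s >> pure ())
    pure (tid, s)  -- the parent keeps s unchanged

main :: IO ()
main = do
    done <- newEmptyMVar
    -- Each child modifies its own copy of the state, then signals completion.
    let child i = modify (+ i) >> liftIO (putMVar done ())
    final <- execStateT (mapM_ (forkState . child) [1 .. 10]) (0 :: Int)
    mapM_ (const (takeMVar done)) [1 .. 10 :: Int]
    print final  -- 0, not 55
```

The parent's final state is still 0 because s -> m (a, s) gives each forked computation its own copy of s.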
This is sometimes what you want. But it's usually not how people write code IRL and it often confuses people.
For a long time, the default choice in the ecosystem to do this "lowering" operation was MonadBaseControl, and this had a bunch of downsides: hella confusing types, difficult to implement instances, impossible to derive instances, sometimes confusing behavior. Not a great situation.
MonadUnliftIO restricts things to a simpler set of monad transformers, and is able to provide relatively simple types, derivable instances, and always predictable behavior. The cost is that transformers such as ExceptT and StateT can't use it.
The underlying principle is: by restricting what is possible, we make it easier to understand what might happen. MonadBaseControl is extremely powerful and general, and quite difficult to use and confusing as a result. MonadUnliftIO is less powerful and general, but it's much easier to use.
So this appears to be saying that state is not mutated in the monad m when using askUnliftIO.
This isn't true - the law is stating that unliftIO shouldn't do anything with the monad transformer aside from lowering it into IO. Here's something that breaks that law:
newtype WithInt a = WithInt (ReaderT Int IO a)
    deriving newtype (Functor, Applicative, Monad, MonadIO, MonadReader Int)
instance MonadUnliftIO WithInt where
    askUnliftIO = pure (UnliftIO (\(WithInt readerAction) -> runReaderT readerAction 0))
Let's verify that this breaks the law given: askUnliftIO >>= (\u -> liftIO (unliftIO u m)) = m.
test :: WithInt Int
test = do
    int <- ask
    liftIO (print int)
    pure int
checkLaw :: WithInt ()
checkLaw = do
    first <- test
    second <- askUnliftIO >>= (\u -> liftIO (unliftIO u test))
    when (first /= second) $
        liftIO (putStrLn "Law violation!!!")
The value returned by test and the askUnliftIO ... lowering/lifting are different, so the law is broken. Furthermore, the observed effects are different, which isn't great either.
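For a version that runs without the unliftio-core package, the sketch below defines local stand-ins for UnliftIO and MonadUnliftIO, modelled on the older API in which askUnliftIO is the class method (an assumption; the real package differs in detail). Bad mirrors WithInt above, and roundTrip is a hypothetical helper that spells out the left-hand side of the law.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE RankNTypes #-}

import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.Reader (ReaderT (..), ask)

-- Local stand-ins for the unliftio-core definitions (assumption).
newtype UnliftIO m = UnliftIO { unliftIO :: forall a. m a -> IO a }

class MonadIO m => MonadUnliftIO m where
    askUnliftIO :: m (UnliftIO m)

-- Lawful: lower the action using the environment currently in scope.
instance MonadUnliftIO (ReaderT Int IO) where
    askUnliftIO = ReaderT $ \r -> pure (UnliftIO (\m -> runReaderT m r))

-- Unlawful: always lower with environment 0, as WithInt does.
newtype Bad a = Bad { runBad :: ReaderT Int IO a }
    deriving (Functor, Applicative, Monad, MonadIO)

instance MonadUnliftIO Bad where
    askUnliftIO = pure (UnliftIO (\(Bad m) -> runReaderT m 0))

-- Left-hand side of the law; a lawful instance makes this equal to m.
roundTrip :: MonadUnliftIO m => m a -> m a
roundTrip m = askUnliftIO >>= \u -> liftIO (unliftIO u m)

main :: IO ()
main = do
    good <- runReaderT (roundTrip ask) (5 :: Int)
    bad  <- runReaderT (runBad (roundTrip (Bad ask))) 5
    print (good, bad)  -- (5,0)
```

With environment 5, the lawful instance returns 5 after the round trip, while Bad returns 0, which is exactly the difference the law forbids.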

Type inference seems like a magic

I have the following code snippet and cannot figure out how it works:
embedded :: MaybeT (ExceptT String (ReaderT () IO)) Int
embedded = return 1
How is it possible to supply only a number and get such a type signature back? How does the compiler do that?
The choice of wording is a bit unfortunate. It's not the case that the expression return 1 gives back the type signature MaybeT (ExceptT String (ReaderT () IO)) Int.
As n.m. writes in the comments, if you don't supply a type, the expression is much more general:
Prelude> embedded = return 1
Prelude> :type embedded
embedded :: (Num a, Monad m) => m a
By annotating with a type, you explicitly state that you want something less general than that.
Specifically, you state that you want the type MaybeT (ExceptT String (ReaderT () IO)) Int.
How does return work? MaybeT m a is a Monad when m is a Monad, and return is defined like this:
return = lift . return
The right-hand return is the return function that belongs to the 'inner' Monad, whereas lift is defined by MonadTrans and lifts the underlying monadic value up to MaybeT.
That explains how a MaybeT value is created, but isn't the whole story.
In this case, the 'inner' Monad is ExceptT String (ReaderT () IO), which is another Monad (in fact, another MonadTrans). return is defined like this:
return a = ExceptT $ return (Right a)
Notice that this is another nested return, where the right-hand return belongs to yet another nested Monad.
In this case, the nested Monad is ReaderT () IO - another MonadTrans. It defines return like this:
return = lift . return
Yet another nested return, where the right-hand return is the return defined for IO (in this particular case).
All of this is parametrised with a, which in this case you've constrained to Int.
So return 1 first takes the pure value 1 and packages it in IO Int. This then gets lifted to ReaderT () IO Int, which in turn gets packaged into an ExceptT String (ReaderT () IO) Int. Finally, this value gets lifted to MaybeT.
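The layering can be observed by peeling the transformers off again, one run function per layer. A minimal sketch, assuming the transformers package:

```haskell
import Control.Monad.Trans.Except (ExceptT, runExceptT)
import Control.Monad.Trans.Maybe (MaybeT, runMaybeT)
import Control.Monad.Trans.Reader (ReaderT, runReaderT)

embedded :: MaybeT (ExceptT String (ReaderT () IO)) Int
embedded = return 1

main :: IO ()
main = do
    -- Unwrap outermost first: MaybeT, then ExceptT, then ReaderT.
    r <- runReaderT (runExceptT (runMaybeT embedded)) ()
    print r  -- Right (Just 1)
```

Each layer's return shows up as one constructor in the result: Just from MaybeT and Right from ExceptT, with ReaderT and IO contributing no visible wrapper.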

STRef and phantom types

Does s in STRef s a get instantiated with a concrete type? One could easily imagine some code where STRef is used in a context where the a takes on Int. But there doesn't seem to be anything for the type inference to give s a concrete type.
Imagine something in pseudo Java like MyList<S, A>. Even if S never appeared in the implementation of MyList instantiating a concrete type like MyList<S, Integer> where a concrete type is not used in place of S would not make sense. So how can STRef s a work?
tl;dr - in practice it seems it always gets initialised to RealWorld in the end
The source notes that s can be instantiated to RealWorld inside invocations of stToIO, but is otherwise uninstantiated:
-- The s parameter is either
-- an uninstantiated type variable (inside invocations of 'runST'), or
-- 'RealWorld' (inside invocations of 'Control.Monad.ST.stToIO').
Looking at the actual code for ST however it seems runST uses a specific value realWorld#:
newtype ST s a = ST (STRep s a)
type STRep s a = State# s -> (# State# s, a #)
runST :: (forall s. ST s a) -> a
runST st = runSTRep (case st of { ST st_rep -> st_rep })
runSTRep :: (forall s. STRep s a) -> a
runSTRep st_rep = case st_rep realWorld# of
    (# _, r #) -> r
realWorld# is defined as a magic primitive inside the GHC source code:
realWorldName = mkWiredInIdName gHC_PRIM (fsLit "realWorld#")
                                realWorldPrimIdKey realWorldPrimId
realWorldPrimId :: Id -- :: State# RealWorld
realWorldPrimId = pcMiscPrelId realWorldName realWorldStatePrimTy
                     (noCafIdInfo `setUnfoldingInfo` evaldUnfolding
                                  `setOneShotInfo` stateHackOneShot)
You can also confirm this in ghci:
Prelude> :set -XMagicHash
Prelude> :m +GHC.Prim
Prelude GHC.Prim> :t realWorld#
realWorld# :: State# RealWorld
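As a small illustration of the two cases mentioned in the source comment, the sketch below runs the same s-polymorphic action once through runST and once through stToIO. The action bump is a hypothetical example; only base is assumed.

```haskell
import Control.Monad.ST (ST, runST, stToIO)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Polymorphic in s: nothing in the action picks a concrete type for it.
bump :: Int -> ST s Int
bump n = do
    ref <- newSTRef n
    modifySTRef' ref (+ 1)
    readSTRef ref

main :: IO ()
main = do
    print (runST (bump 1))      -- s stays abstract from the caller's view; prints 2
    viaIO <- stToIO (bump 1)    -- here s is instantiated to RealWorld
    print viaIO                 -- prints 2
```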
From your question I cannot tell whether you understand why the phantom s type is there at all. Even though you did not ask for this explicitly, let me elaborate on that.
The role of the phantom type
The main use of the phantom type is to constrain references (aka pointers) to stay "inside" the ST monad. Roughly, the dynamically allocated data must end its life when runST returns.
To see the issue, let's pretend that the type of runST were
runST :: ST s a -> a
Then, consider this:
data Dummy
let var :: STRef Dummy Int
    var = runST (newSTRef 0)
    change :: () -> ()
    change () = runST (modifySTRef var succ)
    access :: () -> Int
    access () = runST (readSTRef var)
    result :: (Int, ())
    result = (access (), change ())
in result
(Above I added a few useless () arguments to make it similar to imperative code)
Now what should be the result of the code above? It could be either (0,()) or (1,()) depending on the evaluation order. This is a big no-no in the pure Haskell world.
The issue here is that var is a reference which "escaped" from its runST. When you escape the ST monad, you are no longer forced to use the monad operator >>= (or equivalently, the do notation) to sequence the side effects. If references are still around, then we can still have side effects around when there should be none.
To avoid the issue, we restrict runST to work on ST s a where a does not depend on s. Why this? Because newSTRef returns a STRef s a, a reference to a tagged with the phantom type s, hence the return type depends on s and can not be extracted from the ST monad through runST.
Technically, this restriction is done by using a rank-2 type:
runST :: (forall s. ST s a) -> a
The "forall" here is used to implement the restriction. The type is saying: choose any a you wish, then provide a value of type ST s a for any s I wish, then I will return an a. Mind that s is chosen by runST, not by the caller, so it could be absolutely anything. So, the type system will accept an application runST action only if action :: forall s. ST s a where s is unconstrained, and a does not involve s (recall that the caller has to choose a before runST chooses s).
It is indeed a slightly hackish trick to implement the independence constraint, but it does work.
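A short sketch of how this plays out with the real runST: a computation whose result does not mention s is accepted, while the escaping reference from the earlier example is rejected by the type checker. The name sumST is hypothetical; only base is assumed.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Accepted: the result type Int does not mention s, so the reference
-- cannot outlive this runST.
sumST :: [Int] -> Int
sumST xs = runST (do
    ref <- newSTRef 0
    mapM_ (\x -> modifySTRef' ref (+ x)) xs
    readSTRef ref)

-- Rejected if uncommented: the result type would be STRef s Int, which
-- mentions the s that runST quantifies over.
-- escaped = runST (newSTRef (0 :: Int))

main :: IO ()
main = print (sumST [1 .. 10])  -- 55
```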
On the actual question
To connect this to your actual question: in the implementation of runST, s can be instantiated to any concrete type. Note that, even if s were simply chosen to be Int inside runST, it would not matter much, because the type system has already constrained a to be independent from s, hence to be reference-free. As @Ganesh pointed out, RealWorld is the type used by GHC.
You also mentioned Java. One could attempt to play a similar trick in Java as follows: (warning, overly simplified code follows)
interface ST<S,A> { A call(); }
interface STAction<A> { <S> ST<S,A> call(S dummy); }
...
<A> A runST(STAction<A> action) {
    RealWorld dummy = new RealWorld();
    return action.call(dummy).call();
}
In STAction above, the parameter A cannot depend on S.

How do you reason about the order of execution of functions in a monadT stack?

General theme: While I find the idea of stacking monads together is very appealing, I am having a lot of trouble picturing how the code is executed, and what are the appropriate orders to run the layers. Below is one example of a stack: Writer, State, State, and Error, in no particular order ( or is there? ).
-----------------------
-- Utility Functions --
-----------------------
type Memory = Map String Int
type Counter = Int
type Log = String
tick :: (MonadState Counter m) => m ()
tick = modify (+1)
record :: (MonadWriter Log m) => Log -> m ()
record msg = tell $ msg ++ "; "
------------------
-- MonadT Stack --
------------------
mStack :: ( MonadTrans t, MonadState Memory m, MonadState Counter (t m), MonadError ErrMsg (t m), MonadWriter Log (t m) ) => t m Int
mStack = do
    tick
    m <- lift get
    let x = fromJust ( M.lookup "x" m ) in x
    record "accessed memory"
    case True of
        True -> return 100
        False -> throwError "false"
Please note in mStack, whether an error is thrown or not has nothing to do with any other part of the function.
Now ideally I want the output to look like this:
( Right 100, 1, "accessed memory", fromList [...])
or in general:
( output of errorT, output of stateT Counter, output of writerT, output of StateT Memory )
But I cannot get it to work. Specifically, I tried running the stack as if Error is on the outermost layer:
mem1 = M.fromList [("x",10),("y",5)]
runIdentity $ runWriterT (runStateT (runStateT (runErrorT mStack ) 0 ) mem1 ) ""
But am getting this error message:
Couldn't match type `Int' with `Map [Char] Int'
The above instance aside, in general, when I am calling:
runMonadT_1 ( runMonadT_2 expr param2 ) param1,
are the functions relating to monadT_2 run first, then that output is piped into the functions relating to monadT_1 ? So in other words, as imperative as the code looks in the above function mStack, is the order of execution entirely dependent upon the order in which the monadT are run ( aside from any rigidness in structure introduced by lift ) ?
You would have gotten a more informative type error if you had tried to type your computation using an explicit monad transformer stack:
mStack :: ErrorT String (StateT (Map String Int) (StateT Int (Writer String))) Int
Had you done that, ghc would have caught the type error earlier. The reason is that you use the following two commands within mStack at the top-most level:
modify (+1) -- i.e. from `tick`
...
yourMap <- lift get
If you were to give this an explicit stack, then you'd catch the mistake: both modify and lift get are going to target the first StateT layer they encounter, which happens to be the same StateT layer.
modify begins from the ErrorT layer and proceeds downward until it hits the outer StateT layer, and concludes that the outer StateT must be using an Int state. get begins from the outer StateT layer, notices that it's already in a StateT layer and ignores the inner StateT layer entirely, so it concludes that the outer StateT layer must be storing a Map.
ghc then says "What gives? This layer can't be storing both an Int and a Map!", which explains the type error you got. However, because you used type classes instead of a concrete monad transformer stack, there was no way that ghc could know that this was a type error in waiting until you specified a concrete stack.
The solution is simple: just add another lift to your get and it will now target the inner StateT layer like you intended.
I personally prefer to avoid mtl classes entirely and always work with a concrete monad transformer stack using the transformers library alone. It's more verbose because you have to be precise about which layer you want using lift, but it causes fewer headaches down the road.
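As a sketch of that style, here is a concrete stack written with transformers only, where every operation names its layer through the number of lifts. Several choices are assumptions made for illustration: ExceptT replaces the deprecated ErrorT, the counter is placed in the outer StateT and the memory in the inner one, and the result is computed from the looked-up value instead of the question's constant.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.State (StateT, get, modify, runStateT)
import Control.Monad.Trans.Writer (Writer, runWriter, tell)
import qualified Data.Map as M

type Memory = M.Map String Int

-- Assumed layer order: errors, then counter, then memory, then log.
type Stack = ExceptT String (StateT Int (StateT Memory (Writer String)))

mStack :: Stack Int
mStack = do
    lift (modify (+ 1))                             -- one lift: counter layer
    m <- lift (lift get)                            -- two lifts: memory layer
    lift (lift (lift (tell "accessed memory; ")))   -- three lifts: log layer
    case M.lookup "x" m of
        Just x  -> return (x * 10)
        Nothing -> throwE "x is not in memory"

main :: IO ()
main = do
    let mem1 = M.fromList [("x", 10), ("y", 5)]
        -- Run the layers from the outside in; results nest in the same order.
        (((result, counter), memory), logText) =
            runWriter (runStateT (runStateT (runExceptT mStack) 0) mem1)
    print (result, counter, logText, memory)
    -- (Right 100,1,"accessed memory; ",fromList [("x",10),("y",5)])
```

Because ExceptT is outermost here, the counter, memory, and log are still returned when an error is thrown; putting ExceptT further in would discard the state of the layers above it on failure.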
