How to understand evalState in this State Monad Haskell code snippet? - haskell

I am looking at this compiler code snippet and do not understand what evalState does, being new to State Monad.
compileToAst :: FilePath -> String -> Either Errors (Contract (Check Type, Env, SourcePos))
compileToAst source code = case parse parser source code of
    Right ast ->
        let ast' = evalState ast [globals]
            errors = lefts $ map ann $ toList ast'
            ann (a, _, pos) = a `extend` sourcePosPretty pos
        in if null errors then Right ast' else Left errors
    Left err ->
        Left [(SyntaxError $ parseErrorTextPretty err, sourcePosPretty . NE.head $ errorPos err)]
Assuming stateful computation is in the form of s -> (a, s),
ast is the monadic value, [globals] is s, and evalState ast [globals] returns the result of type a. Where can I find the stateful computation definition that transforms s into a new s and yields the result a?

The function evalState has type:
evalState :: State s a -> s -> a
The type of the first argument, namely State s a, is actually isomorphic to the function type s -> (a, s). What this means formally is that there exist two functions that convert between them:
runState :: State s a -> (s -> (a, s))
state :: (s -> (a, s)) -> State s a
and if you apply one of these functions and then the other, you get back what you started with (i.e., they are inverses, and their composition is the identity function).
Less formally, it means that wherever you see State s a you can pretend it's the type s -> (a, s) and vice versa, since you can convert back and forth at will using these utility functions runState and state.
Therefore, all evalState does is take a first argument that's isomorphic to a stateful computation s -> (a, s) and runs it using an initial state given by its second argument. It then throws away the final state s and yields the final result of the computation.
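In the transformers library, evalState is essentially defined this way (modulo the StateT/Identity wrapping):
evalState :: State s a -> s -> a
evalState m s = fst (runState m s)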
Since it's the first argument to evalState that's the stateful computation, it's actually the ast returned when parse parser source code succeeds that's the stateful transformation s -> (a, s) you're looking for.
That is, the value ast has type:
ast :: State Env (Contract (Check Type, Env, SourcePos))
which is isomorphic to:
ast :: Env -> (Contract (Check Type, Env, SourcePos), Env)
so it's a stateful transformation that operates on a state consisting of an environment (a list of symbol tables) and yields a contract. All evalState does is pass this stateful transformation an initial state/environment, namely the singleton list [globals] containing the global symbol table, and then yield its final contract result (throwing away the final list of symbol tables, since it's no longer important once the contract is generated).
So, the way this compiler is designed, it compiles code into an "abstract syntax tree" that, instead of being a tree-like data structure, is actually a function giving a stateful transformation over an environment state that produces a contract; evalState just "runs" the transformation to generate the contract.

Related

confusion over the passing of State monad in Haskell

In Haskell the State monad is passed around to extract and store state. And in the two following examples, both pass the State monad using >>, and a close verification (by function inlining and reduction) confirms that the state is indeed passed to the next step.
Yet this seems not very intuitive. So does this mean when I want to pass the State monad I just need >> (or the >>= and lambda expression \s -> a where s is not free in a)? Can anyone provide an intuitive explanation for this fact without bothering to reduce the function?
-- the first example
tick :: State Int Int
tick = get >>= \n ->
       put (n+1) >>
       return n

-- the second example
type GameValue = Int
type GameState = (Bool, Int)

playGame' :: String -> State GameState GameValue
playGame' [] = get >>= \(on, score) -> return score
playGame' (x:xs) = get >>= \(on, score) ->
    (case x of
        'a' | on -> put (on, score+1)
        'b' | on -> put (on, score-1)
        'c'      -> put (not on, score)
        _        -> put (on, score))
    >> playGame' xs
Thanks a lot!
It really boils down to understanding that State s a is isomorphic to s -> (a, s). So any value "wrapped" in a monadic action is the result of applying a transformation to some state s (a stateful computation producing a).
Passing a state between two stateful computations
f :: a -> State s b
g :: b -> State s c
corresponds to composing them with >=>
f >=> g
or using >>=
\a -> f a >>= g
the result here is
a -> State s c
It is a stateful action that transforms the underlying state s in some way; it is allowed access to some a and it produces some c. So the entire transformation is allowed to depend on a, and the value c is allowed to depend on the state s. This is exactly what you would want in order to express a stateful computation. The neat thing (and the sole purpose of expressing this machinery as a monad) is that you do not have to bother with passing the state around yourself. To understand how it is done, refer to the definition of >>= on Hackage (just ignore for a moment that it is defined for the transformer StateT rather than for the plain State monad):
m >>= k = StateT $ \s -> do
    ~(a, s') <- runStateT m s
    runStateT (k a) s'
You can disregard the wrapping and unwrapping with StateT and runStateT. Here m is of the form s -> (a, s), k is of the form a -> (s -> (b, s)), and you wish to produce a stateful transformation s -> (b, s). So the result is going to be a function of s. To produce a b you can use k, but you need an a first; how do you produce an a? You take m and apply it to the state s. That also gives you a modified state s' from the first monadic action m, and you pass that state into k a (which is of type s -> (b, s)). It is here that the state s has passed through m to become s', and is then passed to k to become some final s''.
For you as a user of this mechanism, this remains hidden, and that is the neat thing about monads. If you want a state to evolve along some computation, you build your computation from small steps that you express as State actions, and you let do-notation or bind (>>=) do the chaining/passing.
The sole difference between >>= and >> is that you either care or don't care about the non-state result.
a >> b
is in fact equivalent to
a >>= \_ -> b
so whatever value gets output by the action a, you throw it away (keeping only the modified state) and continue (pass the state along) with the other action b.
Regarding your examples
tick :: State Int Int
tick = get >>= \n ->
       put (n+1) >>
       return n
you can rewrite it in do-notation as
tick = do
    n <- get
    put (n + 1)
    return n
While the first way of writing it is perhaps more explicit about what is passed where, the second way nicely shows how you do not have to care about it.
First, get the current state and expose it (get :: s -> (s, s) in a simplified setting); the <- says that you do care about the value and do not want to throw it away; the underlying state is also passed along in the background without change (that is how get works).
Then put :: s -> (s -> ((), s)), which after dropping unnecessary parens is put :: s -> s -> ((), s), takes a value to replace the current state with (the first argument) and produces a stateful action whose result is the uninteresting value (), which you drop (because you do not use <-, or because you use >> instead of >>=). Due to put, the underlying state has changed to n + 1, and as such it is passed on.
return does nothing to the underlying state, it only returns its argument.
To summarise, tick starts with some initial state s, updates it internally to s+1, and outputs the original s as its result.
The other example works exactly the same way, >> is only used there to throw away the () produced by put. But state gets passed around all the time.
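To watch the threading happen, you can run these actions with runState or evalState (a usage sketch, assuming Control.Monad.State from the mtl package is in scope):
runState tick 5                                       -- gives (5, 6): the old value 5 is the result, 6 is the new state
evalState (playGame' "abcaaacbbcabbab") (False, 0)    -- should give 2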

How to print the result of a State Monad in Haskell?

Is it possible to print the result of a state monad in Haskell?
I'm trying to understand state monads and in a book I have been following supplies the code below for creating a state monad, but I am struggling with this topic as I am unable to view the process visually i.e. see the end result.
newtype State s a = State { runState :: s -> (a,s) }

instance Monad (State s) where
    return x = State $ \s -> (x,s)
    (State h) >>= f = State $ \s ->
        let (a, newState) = h s
            (State g) = f a
        in g newState
It is generally not possible to print functions in a meaningful way. If the domain of the function is small, you can import Data.Universe.Instances.Show from the universe-reverse-instances package to get a Show instance that prints a lookup table that is semantically equivalent to the function. With that module imported, you could simply add deriving Show to your newtype declaration to be able to print State actions over small state spaces.
The code you've supplied defines the kind of thing State s a is. And it also says that State s is a monad - that is, the kind of thing State s is conforms to the Monad typeclass/interface. This means you can bind one State s computation to another (as long as the type s is the same in each).
So your situation is analogous to that of someone who has defined the kind of thing that a Map is, and has also written code that says a Map conforms to such and such interfaces, but who doesn't have any maps, and hasn't yet run any computation with them. There's nothing to print then.
I take it you want to see the result of evaluating or executing your state actions, but you have not defined any actual state actions yet, nor have you called runState (or evalState or execState) on them. Don't forget you also need to supply an initial state to run the computation.
So maybe start by letting s and a be some particular types. E.g. let s be Int and let a be Int. Now you could go write some functions, e.g. f :: Int -> (Int, Int), and g :: Int -> (Int, Int). Maybe one function decrements the state, returning the new state and value, and another function increments the state, returning the new state and value. Then you could make a State Int Int out of f by wrapping it in the State constructor. And you could use >>= to chain as many state actions together as you like. Finally, you can use runState on this, to get the resulting value and resulting state, as long as you also supply an initial state (e.g. 0).
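For instance, a minimal sketch along those lines, using the newtype and Monad instance from the question (incr and decr are hypothetical names; on a modern GHC you would also need Functor and Applicative instances):
incr, decr :: State Int Int
incr = State $ \s -> (s, s + 1)   -- result is the current state, new state is one more
decr = State $ \s -> (s, s - 1)   -- result is the current state, new state is one less

-- runState (incr >>= \_ -> incr >>= \_ -> decr) 0   gives (2, 1)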
If it's just the result you want, and if you're just debugging:
import Debug.Trace
import Control.Monad.Trans.State
action :: State [Int] ()
action = do
    put [0]
    modify (1:)
    modify (2:)
    get >>= traceShowM
    modify (3:)
    modify (4:)
    get >>= traceShowM
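To actually see the traces, the action still has to be run, e.g. from GHCi; the initial state passed here is just a placeholder, since the first put overwrites it:
evalState action []   -- should print [2,1,0] and then [4,3,2,1,0] via traceShowM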

Signature of IO in Haskell (is this class or data?)

The question is not what IO does, but how it is defined, i.e. its signature. Specifically, is it a data type or a class, and is "a" its type parameter? I didn't find it anywhere. Also, I don't understand the syntactic meaning of this:
f :: IO a
You asked whether IO a is a data type: it is. And you asked whether the a is its type parameter: it is. You said you couldn't find its definition. Let me show you how to find it:
localhost:~ gareth.rowlands$ ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/ :? for help
Prelude> :i IO
newtype IO a
  = GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld
                  -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
        -- Defined in `GHC.Types'
instance Monad IO -- Defined in `GHC.Base'
instance Functor IO -- Defined in `GHC.Base'
Prelude>
In ghci, :i or :info tells you about a type. It shows the type declaration and where it's defined. You can see that IO is a Monad and a Functor too.
This technique is more useful on normal Haskell types - as others have noted, IO is magic in Haskell. In a typical Haskell type, the type signature is very revealing but the important thing to know about IO is not its type declaration, rather that IO actions actually perform IO. They do this in a pretty conventional way, typically by calling the underlying C or OS routine. For example, Haskell's putChar action might call C's putchar function.
IO is a polymorphic type (which happens to be an instance of Monad, irrelevant here).
Consider the humble list. If we were to write our own list of Ints, we might do this:
data IntList = Nil | Cons { listHead :: Int, listRest :: IntList }
If you then abstract over what element type it is, you get this:
data List a = Nil | Cons { listHead :: a, listRest :: List a }
As you can see, the return value of listRest is List a. List is a polymorphic type of kind * -> *, which is to say that it takes one type argument to create a concrete type.
In a similar way, IO is a polymorphic type with kind * -> *, which again means it takes one type argument. If you were to define it yourself, it might look like this:
data IO a = IO (RealWorld -> (a, RealWorld))
(definition courtesy of this answer)
The amount of magic in IO is grossly overestimated: it has some support from compiler and runtime system, but much less than newbies usually expect.
Here is the source file where it is defined:
http://www.haskell.org/ghc/docs/latest/html/libraries/ghc-prim-0.3.0.0/src/GHC-Types.html
newtype IO a
= IO (State# RealWorld -> (# State# RealWorld, a #))
It is just an optimized version of the state monad. If we remove the optimization annotations we will see:
data IO a = IO (Realworld -> (Realworld, a))
So basically IO a is a data structure storing a function that takes the old real world and returns the new real world (with the IO operation performed) together with a value of type a.
Some compiler tricks are necessary mostly to remove the Realworld dummy value efficiently.
The IO type is an abstract newtype: its constructors are not exported, so you cannot bypass library functions and work with it directly to perform nasty things such as duplicating RealWorld, creating a RealWorld out of nothing, or escaping the monad (writing a function of type IO a -> a).
Since IO can be applied to objects of any type a, as it is a polymorphic monad, a is not specified.
If you have some object with type a, then it can be 'wrapped' as an object of type IO a, which you can think of as an action that gives an object of type a. For example, getChar is of type IO Char, so when it is called, it has the side effect (from the program's perspective) of generating a character, which comes from stdin.
As another example, putChar has type Char -> IO (), meaning that it takes a char, and then performs some action that gives no output (in the context of the program, though it will print the char given to stdout).
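Putting those two together, a minimal sketch of an action that echoes a single character:
echoOne :: IO ()
echoOne = getChar >>= putChar   -- read one character from stdin and write it to stdout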
Edit: More explanation of monads:
A monad can be thought of as a 'wrapper type' M, and has two associated functions:
return and >>=.
Given a type a, it is possible to create objects of type M a (IO a in the case of the IO monad), using the return function.
return, therefore, has type a -> M a. Moreover, return attempts not to change the element that it is passed -- if you call return x, you will get a wrapped version of x that contains all of the information of x (theoretically, at least; this doesn't happen with, for example, the empty monad).
For example, return 'x' will yield an M Char. This is how getChar works -- it yields an IO Char using a return statement, which is then pulled out of its wrapper with <-.
>>=, read as 'bind', is more complicated. It has type M a -> (a -> M b) -> M b, and its role is to take a 'wrapped' object and a function from the underlying type of that object to another 'wrapped' object, and apply that function to the underlying value of the first input.
For example, (return 5) >>= (return . (+ 3)) will yield an M Int, which will be the same M Int that would be given by return 8. In this way, any function that can be applied outside of a monad can also be applied inside of it.
To do this, one could take an arbitrary function f :: a -> b, and give the new function g :: M a -> M b as follows:
g x = x >>= (return . f)
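This g is essentially what fmap (or liftM) gives you for any monad; a small IO-flavoured sketch:
import Data.Char (toUpper)

upperChar :: IO Char
upperChar = getChar >>= (return . toUpper)   -- lifts the pure toUpper over IO; the same as fmap toUpper getChar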
Now, for something to be a monad, these operations must also have certain relations -- their definitions as above aren't quite enough.
First: (return x) >>= f must be equivalent to f x. That is, it must be equivalent to perform an operation on x whether it is 'wrapped' in the monad or not.
Second: x >>= return must be equivalent to x. That is, if an object is unwrapped by bind and then rewrapped by return, it must return to its same state, unchanged.
Third, and finally (x >>= f) >>= g must be equivalent to x >>= (\y -> (f y >>= g) ). That is, function binding is associative (sort of). More accurately, if two functions are bound successively, this must be equivalent to binding the combination thereof.
Now, while this is how monads work, it's not how it's most commonly used, because of the syntactic sugar of do and <-.
Essentially, do begins a long chain of binds, and each <- sort of creates a lambda function that gets bound.
For example,
a = do x <- something
       y <- function x
       return y
is equivalent to
a = something >>= (\x -> (function x) >>= (\y -> return y))
In both cases, something is bound to x, function x is bound to y, and then y is returned to a in the wrapper of the relevant monad.
Sorry for the wall of text, and I hope it explains something. If there's more you need cleared up about this, or something in this explanation is confusing, just ask.
This is a very good question, if you ask me. I remember being very confused about this too, maybe this will help...
'IO' is a type constructor, 'IO a' is a type, and the 'a' (in 'IO a') is a type variable. The letter 'a' carries no significance; the letter 'b' or 't1' could have been used just as well.
If you look at the definition of the IO type constructor you will see that it is a newtype defined as: GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
'f :: IO a' is the type of a function called 'f' of apparently no arguments that returns a result of some unconstrained type in the IO monad. 'in the IO monad' means that f can do some IO (i.e. change the 'RealWorld', where 'change' means replace the provided RealWorld with a new one) while computing its result. The result of f is polymorphic (that's a type variable 'a' not a type constant like 'Int'). A polymorphic result means that in your program it's the caller that determines the type of the result, so used in one place f could return an Int, used in another place it could return a String. 'Unconstrained' means that there's no type class restricting what type can be returned and so any type can be returned.
Why is 'f' a function and not a constant since there are no parameters and Haskell is pure? Because the definition of IO means that 'f :: IO a' could have been written 'f :: GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #)' and so in fact has a parameter -- the 'state of the real world'.
In the type IO a, the a has mainly the same meaning as in Maybe a.
But we can't get rid of the constructor, i.e. we cannot write something like:
fromIO :: IO a -> a
fromIO (IO a) = a
Fortunately we can use this data in monads, like:
{-# LANGUAGE ScopedTypeVariables #-}
foo = do
    (fromIO :: a) <- (dataIO :: IO a)
    ...

Meaning of a newtype statement

I have this statement:
newtype State st a = State (st -> (st, a))
Hence the type of State is:
State :: (st -> (st, a)) -> State st a
I cannot understand the meaning:
Are st and a just placeholders for two data types? Right?
Does the statement mean that State is a function that takes a function as its argument?
Yes. Data constructors are functions in Haskell, with the additional feature that you can pattern match against them. So, for example, if you have a list fs :: [st -> (st, a)] you can do map State fs :: [State st a].
The way the state monad works conventionally is that State st a represents a state transformer: a thing that takes an initial state, performs some computation that may depend or alter that state, and produces a result of type a. Composing two state transformers means creating a composite one that executes the first one with the initial state, and then executes the second one with the state that holds after the first one executes.
So the State monad implementation models that directly as a function of type st -> (st, a). Composing two such functions is just a matter of generating a composite function that feeds the initial state to the first one, passes the state that results from that to the second one, and returns the final state and result of the second one. In code:
bindState :: State st a -> (a -> State st b) -> State st b
bindState (State function1) f =
    State $ \initialState ->
        let (nextState, firstResult) = function1 initialState
            State function2 = f firstResult
        in function2 nextState
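A small usage sketch for bindState (tick, runMyState and twoTicks are hypothetical names, written against the newtype above):
tick :: State Int Int
tick = State $ \n -> (n + 1, n)              -- new state first, old value second, matching this newtype

runMyState :: State st a -> st -> (st, a)    -- hypothetical unwrapper; the newtype above has no field name
runMyState (State f) = f

twoTicks :: State Int (Int, Int)
twoTicks = tick `bindState` \a ->
           tick `bindState` \b ->
           State (\s -> (s, (a, b)))

-- runMyState twoTicks 0   gives (2, (0, 1))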
Yes and yes. The st is the state type and the a is the answer type.

Confusion over the State Monad code on "Learn you a Haskell"

I am trying to get a grasp on Haskell using the online book Learn you a Haskell for great Good.
I have, to my knowledge, been able to understand Monads so far until I hit the chapter introducing the State Monad.
However, the code presented and claimed to be the Monad implementation of the State type (I have not been able to locate it in Hoogle) seems too much for me to handle.
To begin with, I do not understand the logic behind it, i.e. why it should work and how the author came up with this technique (maybe relevant articles or white-papers can be suggested?).
At line 4, it is suggested that function f takes 1 parameter.
However a few lines down we are presented with pop, which takes no parameters!
To extend on point 1, what is the author trying to accomplish by using a function to represent the State?
Any help in understanding what is going on is greatly appreciated.
Edit
To whom it may concern,
The answers below cover my question thoroughly.
One thing I would like to add though:
After reading an article suggested below, I found the answer to my second point above:
All that time I assumed that the pop function would be used like stuff >>= pop, since in the type of bind the second parameter is the function, whereas the correct usage is pop >>= stuff, which I realized after reading again how do-notation translates to plain bind-lambdas.
The State monad represents stateful computations i.e. computations that use values from, and perhaps modify, some external state. When you sequence stateful computations together, the later computations might give different results depending on how the previous computations modified the state.
Since functions in Haskell must be pure (i.e. have no side effects) we simulate the effect of external state by demanding that every function takes an additional parameter representing the current state of the world, and returns an additional value, representing the modified state. In effect, the external state is threaded through a sequence of computations, as in this abomination of a diagram that I just drew in MSPaint:
Notice how each box (representing a computation) has one input and two outputs.
If you look at the Monad instance for State you see that the definition of (>>=) tells you how to do this threading. It says that to bind a stateful computation c0 to a function f that takes results of a stateful computation and returns another stateful computation, we do the following:
Run c0 using the initial state s0 to get a result and a new state: (val, s1)
Feed val to the function f to get a new stateful computation, c1
Run the new computation c1 with the modified state s1
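Spelled out as code, those three steps look like this (a sketch written as a standalone function bindS, using the newtype from the book that is quoted later in this thread):
newtype State s a = State { runState :: s -> (a, s) }

bindS :: State s a -> (a -> State s b) -> State s b
bindS c0 f = State $ \s0 ->
    let (val, s1) = runState c0 s0   -- 1. run c0 with the initial state s0
        c1 = f val                   -- 2. feed val to f to get a new computation c1
    in runState c1 s1                -- 3. run c1 with the modified state s1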
How does this work with functions that already take n arguments? Because every function in Haskell is curried by default, we simply tack an extra argument (for the state) onto the end, and instead of the normal return value, the function now returns a pair whose second element is the newly modified state. So instead of
f :: a -> b
we now have
f :: a -> s -> (b, s)
You might choose to think of this as
f :: a -> ( s -> (b, s) )
which is the same thing in Haskell (since the function arrow is right-associative) and reads "f is a function which takes an argument of type a and returns a stateful computation". And that's really all there is to the State monad.
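For instance, a tiny concrete function of that shape, with the state being an Int counter (a hypothetical example, not from the question):
-- 'label' takes a value and a counter (the state), and returns the value
-- tagged with the counter, plus the bumped counter
label :: a -> Int -> ((Int, a), Int)
label x n = ((n, x), n + 1)
-- read curried: label x :: Int -> ((Int, a), Int) is the stateful computation s -> (b, s)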
Short answer:
State is meant to exploit monads' features in order to simulate an imperative-like system state with local variables. The basic idea is to hide within the monad the activity of taking in the current state and returning the new state together with an intermediate result at each step (and here we have s -> (a,s)).
Do not confuse arbitrary functions with those wrapped within State. The former may have whatever type you want (provided that they eventually produce some State s a if you want to use them in the state monad). The latter hold functions of type s -> (a,s): this is the state-passing layer managed by the monad.
As I said, the function wrapped within State is actually produced by means of (>>=) and return as they're defined for the Monad (State s) instance. Its role is to pass down the state through the calls of your code.
Point 3 is also the reason why the state parameter disappears from the functions actually used in the state monad.
Long answer:
The State Monad has been studied in different papers, and exists also in the Haskell framework (I don't remember the good references right now, I'll add them asap).
This is the idea that it follows: consider a type data MyState = ... whose values holds the current state of the system.
If you want to pass it down through a bunch of functions, you should write every function in such a way that it takes at least the current state as a parameter and returns you a pair with its result (depending on the state and the other input parameters) and the new (possibly modified) state. Well, this is exactly what the type of the state monad tells you: s -> (a, s). In our example, s is MyState and is meant to pass down the state of the system.
The function wrapped in the State does not take any parameters except for the current state, which is needed to produce the new state and an intermediate result. The functions with more parameters that you saw in the examples are not a problem: when you use them in do-notation within the monad, you apply them to all the "extra" parameters, meaning that each of them results in a partially applied function whose only remaining (hidden) parameter is the state; the monad instance for State will do the rest.
If you look at the type of the functions (actually, within monads they are usually called actions) that may be used in the monad, you'll see that their result type is boxed within the monad: this is the point that tells you that once you give them all the parameters, they will not actually return you the result directly, but (in this case) a function s -> (a,s) that fits within the monad's composition laws.
The computation will be executed by passing to the whole block/composition the first/initial state of the system.
Finally, functions that do not take parameters will have a type like State s a, where a is their return type: if you have a look at the value constructor for State, you'll see again that this is actually a function s -> (a,s).
I'm a total newbie to Haskell and I couldn't understand the State Monad code in that book well either. But let me add my answer here to help someone in the future.
Answers:
What are they trying to accomplish with the State Monad?
Composing functions which handle stateful computation.
e.g. push 3 >>= \_ -> push 5 >>= \_ -> pop
Why does pop take no parameters, while it is suggested that function f takes 1 parameter?
pop takes no arguments because it is wrapped by State.
The unwrapped function, whose type is s -> (a, s), takes one argument; the same goes for push.
You can unwrap it with runState:
runState pop :: Stack -> (Int, Stack)
runState (push 3) :: Stack -> ((), Stack)
If by "the function f" you mean the right-hand side of the >>=, then f will be something like \a -> pop or \a -> push 3, not just pop.
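For reference, the book's pop and push look roughly like this (a sketch using the newtype and instance quoted below; with the modern Control.Monad.State library you would use the state function instead of the State data constructor):
type Stack = [Int]

pop :: State Stack Int
pop = State $ \(x:xs) -> (x, xs)     -- the stack is the hidden state; no explicit argument

push :: Int -> State Stack ()
push a = State $ \xs -> ((), a:xs)   -- after the Int is applied, only the hidden state remains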
Long Explanation:
These 3 things helped me to understand State Monad and the Stack example a little more.
Consider the types of the arguments of the bind operator (>>=)
The definition of the bind operator in Monad typeclass is this
(>>=) :: (Monad m) => m a -> (a -> m b) -> m b
In the Stack example, m is State Stack.
If we mentally replace m with State Stack, the definition looks like this.
(>>=) :: State Stack a -> (a -> State Stack b) -> State Stack b
Therefore, the type of left side argument for the bind operator will be State Stack a.
And that of right side will be a -> State Stack b.
Translate do-notation to the bind operator
Here is the example code using do notation in the book.
stackManip :: State Stack Int
stackManip = do
    push 3
    pop
    pop
it can be translated to the following code with bind operator.
stackManip :: State Stack Int
stackManip = push 3 >>= \_ -> pop >>= \_ -> pop
Now we can see what will be the right-hand side for the bind operator.
Their types are a -> State Stack b.
(\_ -> pop) :: a -> State Stack Int
(\_ -> push 3) :: a -> State Stack ()
Recognize the difference between (State s) and (State h) in the instance declaration
Here is the instance declaration for State in the book.
instance Monad (State s) where
    return x = State $ \s -> (x,s)
    (State h) >>= f = State $ \s ->
        let (a, newState) = h s
            (State g) = f a
        in g newState
Considering the types with the Stack example, the type of (State s) will be
(State s) :: State Stack
s :: Stack
And the type of (State h) will be
(State h) :: State Stack a
h :: Stack -> (a, Stack)
(State h) is the left-hand side argument of the bind operator and its type is State Stack a as described above.
Then why does h become Stack -> (a, Stack)?
It is the result of pattern matching against the State value constructor which is defined in the newtype wrapper. The same goes for the (State g).
newtype State s a = State { runState :: s -> (a,s) }
In general, the type of h is s -> (a, s), a representation of the stateful computation. Any of the following could be the h in the Stack example.
runState pop :: Stack -> (Int, Stack)
runState (push 3) :: Stack -> ((), Stack)
runState stackManip :: Stack -> (Int, Stack)
that's it.
The State monad is essentially
type State s a = s -> (a,s)
a function from one state (s) to a pair of the desired result (a) and a new state. The implementation makes the threading of the state implicit and handles the state-passing and updating for you, so there's no risk of accidentally passing the wrong state to the next function.
Thus a function that takes k > 0 arguments, one of which is the state, and returns a pair of something and a new state, becomes in the State s monad a function taking k-1 arguments and returning a monadic action (which basically is a function taking one argument, the state, here).
In the non-State setting, pop takes one argument, the stack, which is the state. So in the monadic setting, pop becomes a State Stack Int action taking no explicit argument.
Using the State monad instead of explicit state-passing makes for cleaner code with fewer opportunities for error, that's what the State monad accomplishes. Everything could be done without it, it would just be more cumbersome and error-prone.
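To make that last point concrete, here is a small sketch (assuming Control.Monad.State from the mtl package; popExplicit and popM are hypothetical names):
import Control.Monad.State

type Stack = [Int]

-- explicit-state version: the stack is an ordinary argument and result
popExplicit :: Stack -> (Int, Stack)
popExplicit (x:xs) = (x, xs)

-- State version: the same function with the stack argument absorbed into the action
popM :: State Stack Int
popM = state popExplicit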

Resources