Just learning about State monad from this excellent tutorial. However, when I tried to explain it to a non-programmer they had a question that stumped me.
If the purpose of State is to simulate mutable memory, why is the function that the State monad stores of the type:
s -> (a, s)
and not simply:
s -> s
In other words, what is the need for the "intermediate" value? For example, couldn't we, in the cases where we need it, simulate it by simply defining a state as a tuple of (state, value)?
I'm sure I confused something, any help is appreciated.
To draw a parallel with an imperative language like C, s -> s corresponds to a function with the return type void, which is invoked purely for side effects (such as mutating the memory). It is isomorphic to State s ().
And indeed, it is possible to write C functions which communicate only through global variables. But, as in C, it is often convenient to return values from functions. That's what a is for.
Of course it's possible that for your particular problem s -> s is a better choice. Although it's not a Monad, it is a Monoid (when wrapped in Endo). So you can construct such functions using <> and mempty, which correspond to >>= and return of Monad.
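For example, here is a minimal sketch of composing plain s -> s steps as a Monoid (the Int state and the particular steps are made up for illustration):
import Data.Monoid (Endo(..))
-- Each step is a plain state transition; <> composes them and
-- mempty is the do-nothing step, mirroring >>= and return.
steps :: Endo Int
steps = Endo (+1) <> Endo (*2) <> mempty
-- Endo composes right-to-left like (.):
-- appEndo steps 5 == (5 * 2) + 1 == 11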
To expand a bit on Nick's answer:
s is the state. If all your functions were s -> s (state to state), your functions would not be able to return any values. You could define your state as (the actual state, value returned), but that conflates the state with the value the state-ful functions are computing. And it's also the common case that you'll want functions to actually compute and return values...
If you fold the returned value into the state, defining s' = (a, s), then s' -> s' is (a, s) -> (a, s). Here it is obvious that your State would need an initial a to start things off, in addition to the s.
On the other hand s -> (a, s) only needs the seed s to begin things and does not require an a value at all.
Thus the type of s -> (a, s) tells you that State is less complex than if it were (a, s) -> (a, s). Types in Haskell convey LOTS of information.
If the purpose of the State is to simulate mutable memory, why is the function that state monad stores is of the type:
s -> (a, s)
and not simply:
s -> s
The purpose of the State monad is not to simulate mutable memory, but rather to model computations that both produce a value and have a side effect. Simply, given some initial state of type s, your computation will produce some value of type a, as well as an updated state.
Maybe your computation does not produce a value... Then, easy: the value type a is simply (). Perhaps on the other hand your computation does not have a side effect. Again, easy: you might think of your state transition function (the s -> s argument to modify) as just being id. But often you're dealing with both at the same time.
You can actually use get and put as relatively simple examples:
get :: State s s -- s -> (s, s)
put :: s -> State s () -- s -> (s -> ((), s))
get is a computation which, given the current state (the first s), will return it both as a value -- that is, the result of the computation -- and as the "new" (unmodified) state.
put is a computation which, given a new state (the first s) and a current state (the second s), will simply ignore the current state. It will produce () as the computed value (because, of course, it hasn't computed any value!) and hang onto the new state provided.
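A quick check with runState (assuming an Int state for concreteness) makes both behaviours visible:
import Control.Monad.State
demoGet :: (Int, Int)
demoGet = runState get 42   -- (42, 42): the state comes back as the value, unchanged
demoPut :: ((), Int)
demoPut = runState (put 7) 42   -- ((), 7): no value computed, new state installed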
Presumably you want to use your stateful computations inside of do notation?
You should ask yourself what the Monad instance would look like for a stateful computation defined by
newtype State s = State { runState :: s -> s }
The problem to be solved is that you have an input and a series of functions, and you want to apply the functions to the input in order.
If the functions are purely state-changing functions, s -> s on an input of type s, then you don't need State to use them. Haskell is very good at chaining together functions like these, e.g. with the standard composition operator ., or something like foldr (.) id, or foldr id.
However, if the functions both mutate a state and report some result of doing so, so that you could give them the type s -> (s,a), then gluing them all together is a bit of a nuisance. You have to unpack the result tuple and pass the new state value to the next function, use the reported value somewhere else, and then unpack that result, and so on. It's easy to pass the wrong state to an input function because you have to name each result and input explicitly to do the unpacking. You end up with something like this:
let
  (res1, s1) = fun1 s0
  (res2, s2) = fun2 s1
  (res3, s3) = fun3 res1 res2 s1
  ...
in resN
There, I accidentally passed s1 instead of s2, maybe because I added the second line in later and didn't realise the third line needed changing. When composing the s -> s functions, this problem can't possibly arise because there are no names to get right:
let
  resN = fun1 . fun2 . fun3 . -- etc.
So we invented State to do the same trick. State is essentially just a way of gluing functions like s -> (s,a) together in such a way that the right state always gets passed to the right function.
So it's not so much that people went "we want to use State, let's use s -> (s,a)" but rather "we're writing functions like s -> (s,a), let's invent State to make that easy". With functions s -> s, it's already easy and we don't have to invent anything.
As an example of how s -> (s,a) arises naturally, consider parsing: a parser will be given some input, consume some of the input and return a value. In Haskell, this is naturally modelled as taking an input list, and returning a pair of the value and the remaining input - i.e. [Input] -> ([Input], a), or State [Input].
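As a rough sketch of that idea (the names item and twoChars are made up for illustration):
import Control.Monad.State
-- Consume one character of the input and return it
-- (partial: it fails on empty input, like the usual pop example).
item :: State String Char
item = state $ \(c:cs) -> (c, cs)
-- The remaining input is threaded automatically between the two items:
-- runState twoChars "abc" == ("ab", "c")
twoChars :: State String String
twoChars = do
  a <- item
  b <- item
  return [a, b]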
a is the value returned, and the s is the final state.
http://www.haskell.org/haskellwiki/State_Monad#Implementation
Consider the following code segment
import Control.Monad.State
type Stack = [Int]
pop :: State Stack Int
pop = state $ \(x:xs) -> (x, xs)
push :: Int -> State Stack ()
push a = state $ \xs -> ((), a:xs)
stackManip :: State Stack Int
stackManip = do
  push 3
  a <- pop
  pop
As we know, do notation is just syntactic sugar for the >>= operator. We can rewrite this segment as:
push 3 >>= (\_ ->
  pop >>= (\a ->
    pop))
The final value of this expression seems unrelated to push 3 and the first pop; no matter what the input is, it just returns the last pop. So it seems the value 3 would never actually be pushed onto the stack and popped. Why is that?
Thanks for your reply. I have added the missing code (the implementations of Stack, push and pop) above, and I think I have figured out how it works. The key to understanding this code is understanding the implementation of the State s monad:
instance Monad (State s) where
  return x = State $ \s -> (x, s)
  (State h) >>= f = State $ \s ->
    let (a, newState) = h s
        (State g) = f a
    in g newState
What the implementation of >>= does is pass a state to the function h, compute the new state, and pass that new state to the function g hidden inside f, producing a newer state still. So the state does change, just implicitly.
Because of the Monad Laws, your code is equivalent to
stackManip :: State Stack Int
stackManip = do
  push 3
  a <- pop -- a == 3
  r <- pop
  return r
So you push 3, pop it, ignore the popped 3, pop another value and return it.
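Tracing the run makes this concrete; even though the pushed 3 is discarded as a value, every step still transforms the state:
-- runState stackManip [5,8,2,1]
--   push 3 : state becomes [3,5,8,2,1]
--   pop    : value 3, state [5,8,2,1] (the 3 is popped and ignored)
--   pop    : value 5, state [8,2,1]
--   result : (5, [8,2,1])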
Haskell is just another programming language. You, the programmer, are in control. Whether the compiler skips the inconsequential instructions is up to it, and is unobservable anyway (except by examining the compiler-produced code, or by measuring the heat of the CPU as it runs your code, which might be a bit hard to do in a server farm beyond the Arctic Circle).
Monads in Haskell are sometimes referred to as "programmable semicolons". It's not a phrase I find particularly helpful in general, but it does capture the way that expressions written with Haskell's do notation have something of the flavour of imperative programs. And in particular that the way the "statements" in a do block get combined is dependent on the particular monad being used. Hence "programmable semicolons" - the way successive "statements" (which in many imperative languages are separated by semicolons) combine together can be changed ("programmed") by using a different monad.
And since do notation is really just syntactic sugar for building up an expression from others using the >>= operator, it's the implementation of >>= for each monad that determines what its "special behaviour" is.
For example, the Monad instance for Maybe allows one, as a rough description, to work with Maybe values as if they are actually values of the underlying type, while ensuring that if a non-value (that is, Nothing) occurs at any point, the computation short-circuits and Nothing will be the overall result.
For the list monad, every line actually gets "executed" multiple times (or none) - once for each element in the list.
And for values of the State s monad, these are essentially "state manipulation functions" of type s -> (a, s) - they take an initial state, and from that compute a new state as well as an output value of some type a. What the >>= implementation - the "semicolon" - does here* is simply ensure that, when one function f :: s -> (a, s) is followed by another g :: s -> (b, s), that the resulting function applies f to the initial state and then applies g to the state computed from f. It's basically just function composition, slightly modified so as to also allow us to access an "output value" whose type is not necessarily related to that of the state. And this allows one to list various state manipulation functions one after another in a do block and know that the state at each stage is exactly that computed by the previous lines put together. This in turn allows a very natural programming style where you give successive "commands" for manipulating the state, yet without actually doing destructive updates, or otherwise departing from the world of pure functions and immutable data.
*strictly speaking, this isn't >>= but >>, an operation which is derived from >>= but ignores the output value. You may have noticed that in the example I gave the a value output by f is totally ignored - but >>= allows that value to be inspected and to determine which computation to do next. In do notation, this means writing a <- f and then using a later. This is actually the key thing which distinguishes Monads from their less powerful, but still vital, cousins (notably Applicative functors).
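Here is a small sketch of the difference, using a counter as the state (the names tick, twoTicks and describe are made up):
import Control.Monad.State
-- Output the old counter value and increment the state.
tick :: State Int Int
tick = do
  n <- get
  put (n + 1)
  return n
-- (>>) discards the first output; the state still advances.
twoTicks :: State Int Int
twoTicks = tick >> tick   -- runState twoTicks 0 == (1, 2)
-- (>>=) lets the output choose the next computation.
describe :: State Int String
describe = tick >>= \n ->
  if even n then return "even" else return "odd"
-- runState describe 0 == ("even", 1)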
Does the put function of the State Monad update the actual state or does it just return a new state with the new value? My question is, can the State Monad be used like a "global variable" in an imperative setting? And does put modify the "global variable"?
My understanding was NO, it does NOT modify the initial state; using the monadic interface we just pass the new states between computations, leaving the initial state intact. Is this correct? If not, please correct me.
The answer is in the types.
newtype State s a = State {runState :: s -> (a, s)}
Thus, a State value is essentially a function that takes one parameter s (which we call the state) and returns a tuple (value, state). The monad is implemented as below:
instance Monad (State s) where
  return a = State $ \s -> (a, s)
  (State f) >>= h = State $ \s ->
    let (a, s') = f s
    in runState (h a) s'
Thus, you have a function that operates on the initial state and spits out a value-state tuple to be processed by the next function in the composition.
Now, put is the following function.
put newState = State $ \_ -> ((), newState)
This essentially sets the state that will be passed to the next function in the composition and the downstream function will see the modified state.
In fact, the State monad is completely pure (nothing is ever mutated); only what is passed downstream changes. In other words, the State monad saves you the trouble of carrying a state around explicitly in a pure language like Haskell: it provides an interface that hides the details of state threading (that's what it's called in WikiBooks and Learn You a Haskell, I think).
The following shows this in action. get copies the state into the value slot (note that by "setting" I mean the output, not a variable). The lambda then receives that value, adds 10 to it, and put installs the result as the new state.
-- execState :: State s a -> s -> s
let x = get >>= \x -> put (x+10)
execState x 10
The above outputs 20.
Now, let's do the following.
execState (x >> x) 10
This will give an output of 30. The first x sets the state to 20 via the put. This is then used by the second x: its get copies the state passed to it, now 20, into the value slot, and its put takes that value, adds 10, and installs the result as the new state.
Thus, you have states in a pure context. Hope this helps.
There's nothing magical about State. You could implement it like this:
newtype State s a = State {runState :: s -> (a, s)}
That is, a State s a (which we think of as a computation that uses a state of type s to produce a result of type a) is just a function that takes a state and returns a result and a new state. You should attempt to write out the Monad instance and definitions of get and put for this definition. The real definition is more general:
type State s = StateT s Identity
newtype Identity a = Identity a
newtype StateT s m a = StateT {runStateT :: s -> m (a, s)}
This allows state to be added to other monadic computations. It's also possible to define state transformers as "operational monads". Apfelmus has a tutorial on those somewhere.
First, the state isn't "global" as such; you could have several different copies of the state monad running, each with its own independent state, and they won't interfere with each other. (Indeed, that's arguably the whole point.) If you wanted the state to be global to the entire program, you'd have to put the entire program into a single state monad.
Second, calling put changes the result that subsequent calls to get will return. That's all. It doesn't "change" the actual value itself. Like, if you call get and put the result into a variable somewhere, and then call put, your variable won't change. Even if the state is a dictionary or something, if you were to add a new key and put that, anybody still looking at the old copy of the dictionary will still see the old dictionary. This isn't special to the state monad; it's just how Haskell works.
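A minimal sketch of that last point (with an Int state for illustration):
import Control.Monad.State
demo :: State Int (Int, Int)
demo = do
  old <- get          -- bind the current state value to 'old'
  put (old + 1)       -- changes what later gets return...
  new <- get
  return (old, new)   -- ...but 'old' itself still holds the original value
-- runState demo 5 == ((5, 6), 6)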
In reading about monads, I keep seeing phrases like "computations in the Xyz monad". What does it mean for a computation to be "in" a certain monad?
I think I have a fair grasp on what monads are about: allowing computations to produce outputs that are usually of some expected type, but can alternatively or additionally convey some other information, such as error status, logging info, state and so on, and allow such computations to be chained.
But I don't get how a computation would be said to be "in" a monad. Does this just refer to a function that produces a monadic result?
Examples: (search "computation in")
http://www.haskell.org/tutorial/monads.html
http://www.haskell.org/haskellwiki/All_About_Monads
http://ertes.de/articles/monads.html
Generally, a "computation in a monad" means not just a function returning a monadic result, but such a function used inside a do block, or as part of the second argument to (>>=), or anything else equivalent to those. The distinction is relevant to something you said in a comment:
"Computation" occurs in func f, after val extracted from input monad, and before result is wrapped as monad. I don't see how the computation per se is "in" the monad; it seems conspicuously "out" of the monad.
This isn't a bad way to think about it--in fact, do notation encourages it because it's a convenient way to look at things--but it does result in a slightly misleading intuition. Nowhere is anything being "extracted" from a monad. To see why, forget about (>>=)--it's a composite operation that exists to support do notation. The more fundamental definition of a monad is three orthogonal functions:
fmap :: (a -> b) -> (m a -> m b)
return :: a -> m a
join :: m (m a) -> m a
...where m is a monad.
Now think about how to implement (>>=) with these: starting with arguments of type m a and a -> m b, your only option is using fmap to get something of type m (m b), after which you can use join to flatten the nested "layers" to get just m b.
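In code, that reasoning is a single definition (written as a standalone bind so it does not clash with the Prelude's (>>=)):
import Control.Monad (join)
-- fmap pushes f inside, producing m (m b); join collapses the two layers.
bind :: Monad m => m a -> (a -> m b) -> m b
bind m f = join (fmap f m)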
In other words, nothing is being taken "out" of the monad--instead, think of the computation as going deeper into the monad, with successive steps being collapsed into a single layer of the monad.
Note that the monad laws are also much simpler from this perspective--essentially, they say that when join is applied doesn't matter as long as the nesting order is preserved (a form of associativity) and that the monadic layer introduced by return does nothing (an identity value for join).
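Spelled out, those join-based laws read:
-- Associativity: collapsing inner layers first or outer layers first agrees.
--   join . fmap join   ==  join . join           :: m (m (m a)) -> m a
-- Identity: the layer introduced by return is a no-op.
--   join . fmap return ==  join . return  ==  id :: m a -> m a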
Does this just refer to a function that produces a monadic result?
Yes, in short.
In long, it's because Monad allows you to inject values into it (via return), but once inside the Monad they're stuck. You have to use some function like runWriter or runCont, which is strictly more specific than Monad, to get values back "out".
More than that, Monad (really, its partner, Applicative) is the essence of having a "container" and allowing computations to happen inside of it. That's what (>>=) gives you, the ability to do interesting computations "inside" the Monad.
So functions like Monad m => m a -> (a -> m b) -> m b let you compute with and around and inside a Monad. Functions like Monad m => a -> m a let you inject into the Monad. Functions like m a -> a would let you "escape" the Monad except they don't exist in general (only in specific). So, for conversation's sake we like to talk about functions that have result types like Monad m => m a as being "inside the monad".
Usually monad stuff is easier to grasp when starting with "collection-like" monads as examples. Imagine you calculate the distance between two points:
data Point = Point Double Double
distance :: Point -> Point -> Double
distance p1 p2 = undefined
Now you may have a certain context. E.g. one of the points may be "illegal" because it is out of some bounds (e.g. on the screen). So you wrap your existing computation in the Maybe monad:
distance :: Maybe Point -> Maybe Point -> Maybe Double
distance p1 p2 = undefined
You have exactly the same computation, but with the additional feature that there may be "no result" (encoded as Nothing).
Or you have two groups of "possible" points and need their mutual distances (e.g. to later use the shortest connection). Then the list monad is your "context":
distance :: [Point] -> [Point] -> [Double]
distance p1 p2 = undefined
Or the points are entered by a user, which makes the calculation "nondeterministic" (in the sense that you depend on things in the outside world, which may change), then the IO monad is your friend:
distance :: IO Point -> IO Point -> IO Double
distance p1 p2 = undefined
The computation remains always the same, but happens to take place in a certain "context", which adds some useful aspects (failure, multi-value, nondeterminism). You can even combine these contexts (monad transformers).
You may write a definition that unifies the definitions above, and works for any monad:
distance :: Monad m => m Point -> m Point -> m Double
distance p1 p2 = do
  Point x1 y1 <- p1
  Point x2 y2 <- p2
  return $ sqrt ((x1-x2)^2 + (y1-y2)^2)
That shows that our computation is really independent of the actual monad, which leads to formulations like "x is computed in(side) the y monad".
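For instance (expected values shown in comments):
-- Maybe: failure propagates.
--   distance (Just (Point 0 0)) (Just (Point 3 4)) == Just 5.0
--   distance Nothing (Just (Point 3 4))            == Nothing
-- List: all mutual distances.
--   distance [Point 0 0, Point 1 1] [Point 3 4]    == [5.0, sqrt 13]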
Looking at the links you provided, it seems that a common usage of "computation in" is with regards to a single monadic value. Excerpts:
Gentle introduction - here we run a computation in the SM monad, but the computation is the monadic value:
-- run a computation in the SM monad
runSM :: S -> SM a -> (a,S)
All about monads - previous computation refers to a monadic value in the sequence:
The >> function is a convenience operator that is used to bind a monadic computation that does not require input from the previous computation in the sequence
Understanding monads - here the first computation could refer to e.g. getLine, a monadic value :
(binding) gives an intrinsic idea of using the result of a computation in another computation, without requiring a notion of running computations.
So as an analogy, if I say i = 4 + 2, then i is the value 6, but it is equally a computation, namely the computation 4 + 2. It seems the linked pages uses computation in this sense - computation as a monadic value - at least some of the time, in which case it makes sense to use the expression "a computation in" the given monad.
Consider the IO monad. A value of type IO a is a description of a large (often infinite) number of behaviours where a behaviour is a sequence of IO events (reads, writes, etc). Such a value is called a "computation"; in this case it is a computation in the IO monad.
I am trying to get a grasp on Haskell using the online book Learn you a Haskell for great Good.
I have, to my knowledge, been able to understand Monads so far until I hit the chapter introducing the State Monad.
However, the code presented and claimed to be the Monad implementation of the State type (I have not been able to locate it in Hoogle) seems too much for me to handle.
To begin with, I do not understand the logic behind it, i.e. why it should work and how the author came up with this technique. (Maybe relevant articles or white-papers can be suggested?)
At line 4, it is suggested that function f takes 1 parameter.
However a few lines down we are presented with pop, which takes no parameters!
To extend on point 1: what is the author trying to accomplish by using a function to represent the State?
Any help in understanding what is going on is greatly appreciated.
Edit
To whom it may concern,
The answers below cover my question thoroughly.
One thing I would like to add though:
After reading an article suggested below, I found the answer to my second point above:
All that time I assumed that the pop function would be used like stuff >>= pop, since in bind's type the second parameter is the function, whereas the correct usage is pop >>= stuff, which I realized after rereading how do-notation translates to plain bind-lambdas.
The State monad represents stateful computations i.e. computations that use values from, and perhaps modify, some external state. When you sequence stateful computations together, the later computations might give different results depending on how the previous computations modified the state.
Since functions in Haskell must be pure (i.e. have no side effects) we simulate the effect of external state by demanding that every function takes an additional parameter representing the current state of the world, and returns an additional value, representing the modified state. In effect, the external state is threaded through a sequence of computations, as in this abomination of a diagram that I just drew in MSPaint:
Notice how each box (representing a computation) has one input and two outputs.
If you look at the Monad instance for State you see that the definition of (>>=) tells you how to do this threading. It says that to bind a stateful computation c0 to a function f that takes results of a stateful computation and returns another stateful computation, we do the following:
Run c0 using the initial state s0 to get a result and a new state: (val, s1)
Feed val to the function f to get a new stateful computation, c1
Run the new computation c1 with the modified state s1
How does this work with functions that already take n arguments? Because every function in Haskell is curried by default, we simply tack an extra argument (for the state) onto the end, and instead of the normal return value, the function now returns a pair whose second element is the newly modified state. So instead of
f :: a -> b
we now have
f :: a -> s -> (b, s)
You might choose to think of this as
f :: a -> ( s -> (b, s) )
which is the same thing in Haskell (since the function arrow -> is right-associative) and which reads "f is a function which takes an argument of type a and returns a stateful computation". And that's really all there is to the State monad.
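The push function from the familiar stack example fits this shape exactly (a sketch, using the standard LYAH-style definition):
import Control.Monad.State
type Stack = [Int]
-- a -> s -> (b, s) with a = Int, s = Stack, b = (), curried and wrapped:
push :: Int -> State Stack ()
push a = state $ \xs -> ((), a:xs)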
Short answer:
State is meant to exploit monads' features in order to simulate an imperative-like system state with local variables. The basic idea is to hide within the monad the activity of taking in the current state and returning the new state together with an intermediate result at each step (and here we have s -> (a,s)).
Do not mistake arbitrary functions with those wrapped within the State. The former may have whatever type you want (provided that they eventually produce some State a if you want to use them in the state monad). The latter holds functions of type s -> (a,s): this is the state-passing layer managed by the monad.
As I said, the function wrapped within State is actually produced by means of (>>=) and return as they're defined for the Monad (State s) instance. Its role is to pass down the state through the calls of your code.
Point 3 is also the reason why the state parameter disappears from the functions actually used in the state monad.
Long answer:
The State monad has been studied in different papers and also exists in the Haskell libraries (I don't remember the good references right now; I'll add them as soon as possible).
This is the idea that it follows: consider a type data MyState = ... whose values holds the current state of the system.
If you want to pass it down through a bunch of functions, you should write every function in such a way that it takes at least the current state as a parameter and returns you a pair with its result (depending on the state and the other input parameters) and the new (possibly modified) state. Well, this is exactly what the type of the state monad tells you: s -> (a, s). In our example, s is MyState and is meant to pass down the state of the system.
The function wrapped in the State does not take parameters except from the current state, which is needed to produce as a result the new state and an intermediate result. The functions with more parameters that you saw in the examples are not a problem, because when you use them in the do-notation within the monad, you'll apply them to all the "extra" needed parameters, meaning that each of them will result in a partially applied function whose unique remaining parameter is the state; the monad instance for State will do the rest.
If you look at the type of the functions (within monads they are usually called actions) that may be used in the monad, you'll see that their result type is boxed within the monad. This is the point that tells you that once you give them all their parameters, they will not actually return the result, but (in this case) a function s -> (a,s) that fits within the monad's composition laws.
The computation will be executed by passing to the whole block/composition the first/initial state of the system.
Finally, functions that do not take parameters will have a type like State s a, where a is their return type: if you have a look at the value constructor for State, you'll see again that this is actually a function s -> (a,s).
I'm a total newbie to Haskell and I couldn't understand the State monad code in that book well either. But let me add my answer here to help someone in the future.
Answers:
What are they trying to accomplish with the State monad?
Composing functions which handle stateful computation.
e.g. push 3 >>= \_ -> push 5 >>= \_ -> pop
Why does pop take no parameters, while it is suggested that function f takes one parameter?
pop takes no arguments because it is wrapped by State.
The unwrapped function, whose type is s -> (a, s), takes one argument. The same goes for push.
You can unwrap them with runState:
runState pop :: Stack -> (Int, Stack)
runState (push 3) :: Stack -> ((), Stack)
If by the "function f" you mean the right-hand side of the >>=, then f will be something like \a -> pop or \a -> push 3, not just pop.
Long Explanation:
These 3 things helped me to understand State Monad and the Stack example a little more.
Consider the types of the arguments for the bind operator (>>=)
The definition of the bind operator in Monad typeclass is this
(>>=) :: (Monad m) => m a -> (a -> m b) -> m b
In the Stack example, m is State Stack.
If we mentally replace m with State Stack, the definition looks like this:
(>>=) :: State Stack a -> (a -> State Stack b) -> State Stack b
Therefore, the type of left side argument for the bind operator will be State Stack a.
And that of right side will be a -> State Stack b.
Translate do notation to the bind operator
Here is the example code using do notation in the book.
stackManip :: State Stack Int
stackManip = do
  push 3
  pop
  pop
It can be translated to the following code with the bind operator:
stackManip :: State Stack Int
stackManip = push 3 >>= \_ -> pop >>= \_ -> pop
Now we can see what will be the right-hand side for the bind operator.
Their types are a -> State Stack b.
(\_ -> pop) :: a -> State Stack Int
(\_ -> push 3) :: a -> State Stack ()
Recognize the difference between (State s) and (State h) in the instance declaration
Here is the instance declaration for State in the book.
instance Monad (State s) where
  return x = State $ \s -> (x, s)
  (State h) >>= f = State $ \s ->
    let (a, newState) = h s
        (State g) = f a
    in g newState
Considering the types with the Stack example, the type of (State s) will be
(State s) :: State Stack
s :: Stack
And the type of (State h) will be
(State h) :: State Stack a
h :: Stack -> (a, Stack)
(State h) is the left-hand side argument of the bind operator and its type is State Stack a as described above.
Then why does h become Stack -> (a, Stack)?
It is the result of pattern matching against the State value constructor which is defined in the newtype wrapper. The same goes for the (State g).
newtype State s a = State { runState :: s -> (a,s) }
In general, the type of h is s -> (a, s), the representation of the stateful computation. Any of the following could be the h in the Stack example:
runState pop :: Stack -> (Int, Stack)
runState (push 3) :: Stack -> ((), Stack)
runState stackManip :: Stack -> (Int, Stack)
That's it.
The State monad is essentially
type State s a = s -> (a,s)
a function from one state (s) to a pair of the desired result (a) and a new state. The implementation makes the threading of the state implicit and handles the state-passing and updating for you, so there's no risk of accidentally passing the wrong state to the next function.
Thus a function that takes k > 0 arguments, one of which is a state, and returns a pair of something and a new state, becomes in the State s monad a function taking k-1 arguments and returning a monadic action (which is basically a function taking one argument, the state).
In the non-State setting, pop takes one argument, the stack, which is the state. So in the monadic setting, pop becomes a State Stack Int action taking no explicit argument.
Using the State monad instead of explicit state-passing makes for cleaner code with fewer opportunities for error, that's what the State monad accomplishes. Everything could be done without it, it would just be more cumbersome and error-prone.
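For comparison, here is a sketch of the same two-pop manipulation with and without State (the names are made up for illustration):
import Control.Monad.State
type Stack = [Int]
-- Explicit threading: each call site must pass the right state along.
popE :: Stack -> (Int, Stack)
popE (x:xs) = (x, xs)
manipExplicit :: Stack -> (Int, Stack)
manipExplicit s0 = let (_, s1) = popE s0
                   in popE s1
-- Monadic version: same logic, threading handled by (>>).
pop :: State Stack Int
pop = state $ \(x:xs) -> (x, xs)
manipMonadic :: State Stack Int
manipMonadic = pop >> pop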
I have a function:
test :: String -> State String String
test x =
  get >>= \test ->
  let test' = x ++ test in
  put test' >>
  get >>= \test2 -> put (test2 ++ x) >>
  return "test"
I can pretty much understand what goes on throughout this function, and I'm starting to get the hang of monads. What I don't understand is how, when I run this:
runState (test "testy") "testtest"
the 'get' function in 'test' somehow gets the initial state "testtest". Can someone break this down and explain it to me?
I appreciate any responses!
I was originally going to post this as a comment, but decided to expound a bit more.
Strictly speaking, get doesn't "take" an argument. I think a lot of what is going on is masked by what you aren't seeing--the instance definitions of the State monad.
get is actually a method of the MonadState class. The State monad is an instance of MonadState, providing the following definition of get:
get = State $ \s -> (s,s)
In other words, get just returns a very basic State value (remembering that a monadic value can be thought of as a "wrapper" for a computation) in which any input state s produces the pair (s, s) as the result.
The next thing we need to look at is >>=, which State defines thusly:
m >>= k = State $ \s ->
  let (a, s') = runState m s
  in runState (k a) s'
So, >>= is going to yield a new computation, which won't be computed until it gets an initial state (this is true of all State computations when they're in their "wrapped" form). The result of this new computation is achieved by applying whatever is on the right side of the >>= to the result of running the computation that was on the left side. (That's a pretty confusing sentence that may require an additional reading or two.)
I've found it quite useful to "desugar" everything that's going on. Doing so takes a lot more typing, but should make the answer to your question (where get is getting from) very clear. Note that the following should be considered pseudocode...
test x =
  State $ \s ->
    let (a, s') = runState (State (\s -> (s, s))) s -- substituting above defn. of 'get'
    in runState (rightSide a) s'
  where
    rightSide test =
      let test' = x ++ test in
      State $ \s2 ->
        let (a2, s2') = runState (State $ \_ -> ((), test')) s2 -- defn. of 'put'
        in runState (rightSide2 a2) s2'
    rightSide2 _ =
      -- etc...
That should make it obvious that the end result of our function is a new State computation that will need an initial value (s) to make the rest of the stuff happen. You supplied s as "testtest" with your runState call. If you substitute "testtest" for s in the above pseudocode, you'll see that the first thing that happens is we run get with "testtest" as the 'initial state'. This yields ("testtest", "testtest") and so on.
So that's where get gets your initial state "testtest". Hope this helps!
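Tracing the whole example through with that substitution gives:
-- runState (test "testy") "testtest"
--   get              : test  = "testtest"
--   put test'        : state = "testy" ++ "testtest"      = "testytesttest"
--   get              : test2 = "testytesttest"
--   put (test2 ++ x) : state = "testytesttest" ++ "testy" = "testytesttesttesty"
--   return "test"    : value = "test"
-- result: ("test", "testytesttesttesty")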
It might help you to take a deeper look at what the State type constructor really is, and how runState uses it. In GHCi:
Prelude Control.Monad.State> :i State
newtype State s a = State {runState :: s -> (a, s)}
Prelude Control.Monad.State> :t runState
runState :: State s a -> s -> (a, s)
State takes two type arguments: the type of the state and the type of the result. It is implemented as a function taking the initial state and yielding a return value together with the new state.
runState takes such a function, the initial input, and (most probably) just applies one to the other to retrieve the (result,state) pair.
Your test function is a big composition of State-type functions, each taking a state input and yielding a (result,state) output, plugged in one another in a way that makes sense to your program. All runState does is provide them with a state starting point.
In this context, get is simply a function that takes state as an input, and returns a (result,state) output such that the result is the input state, and the state is unchanged (the output state is the input state). In other words, get s = (s, s)
Going through chapter 8 ("Functional Parsers") of Graham Hutton's Programming in Haskell several times until I'd properly understood it, followed by a go at the tutorial All About Monads, made this click for me.
The issue with monads is that they are very useful for several things that those of us coming from the usual programming background find quite dissimilar. It takes some time to appreciate that control flow and handling state are not only similar enough to be handled by the same mechanism, but are, when you step back far enough, the same thing.
An epiphany came when I was considering control structures in C (for, while, etc.), and I realized that by far the most common control structure was simply putting one statement after another. It took a year of studying Haskell before I realized that that, too, was a control structure.