Looking at Learn You a Haskell's definition of the State Monad:
instance Monad (State s) where
    return x = State $ \s -> (x,s)
    (State h) >>= f = State $ \s -> let (a, newState) = h s
                                        (State g) = f a
                                    in  g newState
I don't understand the types of h s and g newState in the lower right-hand side.
Can you please explain their types and what's going on?
State s a is a naming of a function---the "state transformer function"
s -> (a, s)
In other words, it takes an input state s and modifies that state while also returning a result, a. This forms a really general framework of "pure state". If our state is an integer, we can write a function which updates that integer and returns the new value---this is like a unique number source.
upd :: Int -> (Int, Int)
upd s = let s' = s + 1 in (s', s')
Here, a and s end up being the same type.
Now this is all fine and good, except that we're in trouble if we'd like to get two fresh numbers. For that we must somehow run upd twice.
The final result is going to be another state transformer function, so we're looking for a "state transformer transformer". I'll call it compose:
compose :: (s -> (a, s))         -- the initial state transformer
        -> (a -> (s -> (b, s)))  -- a new state transformer, built using the "result"
                                 -- of the previous one
        -> (s -> (b, s))         -- the result state transformer
This is a little hairy looking, but honestly it's fairly easy to write this function. The types guide you to the answer:
compose f f' = \s ->
  let (a, s')  = f s
      (b, s'') = f' a s'
  in (b, s'')
You'll notice that the s-typed variables, [s, s', s''] "flow downward" indicating that state moves from the first computation through the second leading to the result.
We can use compose to build a function which gets two unique numbers using upd
twoUnique :: Int -> ((Int, Int), Int)
twoUnique = compose upd (\a s -> let (a', s') = upd s in ((a, a'), s'))
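To see the threading in action, here is a minimal, self-contained sketch that collects the definitions above and can be loaded into GHCi. The helper threeUnique is hypothetical (it is not part of the original answer); it only shows that compose nests naturally when more steps are needed.

```haskell
-- A unique-number source: bump the counter and return the new value.
upd :: Int -> (Int, Int)
upd s = let s' = s + 1 in (s', s')

-- Sequence two state transformers, feeding the first result to the second.
compose :: (s -> (a, s)) -> (a -> (s -> (b, s))) -> (s -> (b, s))
compose f f' = \s ->
  let (a, s')  = f s
      (b, s'') = f' a s'
  in (b, s'')

twoUnique :: Int -> ((Int, Int), Int)
twoUnique = compose upd (\a s -> let (a', s') = upd s in ((a, a'), s'))

-- Hypothetical helper (not in the original): three numbers via nested compose.
threeUnique :: Int -> ((Int, Int, Int), Int)
threeUnique =
  compose upd $ \a ->
  compose upd $ \b ->
  compose upd $ \c s -> ((a, b, c), s)
```

Starting from 0, twoUnique 0 evaluates to ((1,2),2): the first upd hands out 1, the second hands out 2, and the final state is 2.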
These are the basics of State. The only difference is that we recognize there's a common pattern going on inside of the compose function and we extract it. That pattern looks like
(>>=) :: State s a -> (a -> State s b ) -> State s b
(>>=) :: (s -> (a, s)) -> (a -> (s -> (b, s))) -> (s -> (b, s))
It's implemented the same way, too. We just need to "wrap" and "unwrap" the State bit---that's the purpose of State and runState
State :: (s -> (a, s)) -> State s a
runState :: State s a -> (s -> (a, s))
Now we can take compose and compare it to (>>=)
compose f f' = \s ->
  let (a, s')  = f s
      (b, s'') = f' a s'
  in (b, s'')
(>>=) (State f) f' = State $ \s ->
  let (a, s')  = f s
      (b, s'') = runState (f' a) s'
  in (b, s'')
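As a rough sketch of how the wrapped version fits together, the following defines its own State newtype (a stand-in for the library type, so nothing needs to be installed) and reuses the bind shown above. The Functor and Applicative instances are there because current GHC requires them as superclasses of Monad; fresh and twoFresh are illustrative names, not from the original.

```haskell
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure x = State $ \s -> (x, s)
  State ff <*> State fa = State $ \s ->
    let (f, s')  = ff s
        (a, s'') = fa s'
    in (f a, s'')

instance Monad (State s) where
  -- Same shape as compose, plus wrapping (State) and unwrapping (runState).
  State f >>= f' = State $ \s ->
    let (a, s')  = f s
        (b, s'') = runState (f' a) s'
    in (b, s'')

-- The unique-number source from before, now wrapped in State.
fresh :: State Int Int
fresh = State $ \s -> let s' = s + 1 in (s', s')

-- do-notation desugars to (>>=), so the state is threaded for us.
twoFresh :: State Int (Int, Int)
twoFresh = do
  a <- fresh
  b <- fresh
  pure (a, b)
```

With these definitions, runState twoFresh 0 gives ((1,2),2), the same answer as the hand-written twoUnique.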
The State Monad certainly is confusing the first time you see it. The first thing that's important to understand is its data declaration, which is
newtype State s a = State { runState :: s -> (a,s) }
so a State contains a function with the type s -> (a,s). We can think of this as a function acting on some sort of generator and returning a tuple of a value and a new generator. This is how random numbers work in Haskell, for example: s is the generator while a is the result of the function that takes a generator as input and outputs a random number a (say, of type Int, but it could just as easily be any other type).
Now let's talk about the instance declaration. Recall the type of (>>=) is
Monad m => m a -> (a -> m b) -> m b
In particular, we note that f should have the type a -> m b. In this case, m is State s, so the type of f should be a -> State s b. So now we can break down the instance declaration
(State h) >>= f = State $ \s -> let (a, newState) = h s
                                    (State g) = f a
                                in  g newState
Since f has the type a -> State s b, the type of State g must be State s b (i.e. g :: s -> (b,s)), and since h has the type s -> (a,s), we must have newState :: s. Thus the result of the bind expression is g newState, which is of type (b, s).
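If it helps, the same reasoning can be written down as type annotations that GHC checks. This is only a sketch: bindState is a hypothetical standalone name for the instance's (>>=), the State newtype is redefined locally, and the ScopedTypeVariables extension is assumed so that the s, a and b in the annotations refer to the ones in the signature. Note that the value a and the type a live in different namespaces, so (a :: a) is legal, if a little odd to read.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

newtype State s a = State { runState :: s -> (a, s) }

-- The bind from the instance, as a standalone function with every piece annotated.
bindState :: forall s a b. State s a -> (a -> State s b) -> State s b
bindState (State h) f = State $ \(s :: s) ->
  let (a :: a, newState :: s) = h s     -- h :: s -> (a, s), so h s :: (a, s)
      State (g :: s -> (b, s)) = f a    -- f a :: State s b
  in g newState                         -- g newState :: (b, s)
```

For example, binding State (\n -> (n, n + 1)) to a function that shows its argument and doubles the state, then running from 3, yields ("3", 8).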
For further reading, here is a great article that helped me to understand the State Monad when I first came across it.
From the definition of the State monad at LYAH:
newtype State s a = State { runState :: s -> (a,s) }
This means the argument to the State data constructor is a function which takes a state and produces an a and a new state. Thus h in the example above is a function, and h s computes a and newState.
From Hoogle we see the definition of (>>=) is
(>>=) :: Monad m => m a -> (a -> m b) -> m b
which means f is a function from a to State s b. Thus it makes sense to give f the argument a, and the result is a State. Just like h, g is the argument to a State constructor: it takes a state (in this case newState) and returns a pair (b, newState2).
It might be more instructive to ask what (>>=) actually does: it lifts the function argument to a monad. A State is just a placeholder for a value depending on the current state, which is why the argument to the constructor depends on the state. Thus given a State "value", we first apply the state \s -> let (a, newState) = h s to get the corresponding value and a new state. Now we pass that value to the function (note that the types match up) and get a new state, i.e. a new function from a state to a value. Finally, we evaluate that state at newState to thread the state to the next part of the computation.
I'm trying to create an instance for bind operator (>>=) to the custom type ST a
I found this way to do it but I don't like that hardcoded 0.
Is there any way to implement it without having the hardcoded 0 and respecting the type of the function?
newtype ST a = S (Int -> (a, Int))
-- This may be useful to implement ">>=" (bind), but it is not mandatory to use it
runState :: ST a -> Int -> (a, Int)
runState (S s) = s
instance Monad ST where
    return :: a -> ST a
    return x = S (\n -> (x, n))
    (>>=) :: ST a -> (a -> ST b) -> ST b
    s >>= f = f (fst (runState s 0))
I often find it easier to follow such code with a certain type of a pseudocode rewrite, like this: starting with the
instance Monad ST where
    return :: a -> ST a
    return x = S (\n -> (x, n))
we get to the
runState (return x) n = (x, n)
which expresses the same thing exactly. It is now a kind of a definition through an interaction law that it must follow. This allows me to ignore the "noise"/wrapping around the essential stuff.
Similarly, then, we have
    (>>=) :: ST a -> (a -> ST b) -> ST b
    s >>= f = -- f (fst (runState s 0))  -- nah, 0? what's that?
      --
      -- runState (s >>= f) n = runState (f a) i  where
      --                            (a, i) = runState s n
      --
      S $ \ n -> let (a, i) = runState s n in
                 runState (f a) i
because now we have an Int in sight (i.e. in scope), n, that will get provided to us when the combined computation s >>= f will "run". I mean, when it will runState.
Of course nothing actually runs until called upon from main. But it can be a helpful metaphor to hold in mind.
The way we've defined it is both the easiest and the most general, which is usually the way to go. There are more ways to make the types fit though.
One is to use n twice, in the input to the second runState as well, but this will leave the i hanging unused.
Another way is to flip the time arrow around w.r.t. the state passing, with
    S $ \ n -> let (a, i2) = runState s i
                   (b, i ) = runState (f a) n
               in  (b, i2)
which is a bit weird to say the least. s still runs first (as expected for the s >>= f combination) to produce the value a from which f creates the second computation stage, but the state is being passed around in the opposite direction.
The most important thing to keep in mind is that your ST type is a wrapper around a function. What if you started your definition as (>>=) = \s -> \f -> S (\n -> ... )? It might be (ok, is) a bit silly to write separate lambdas for the s and f parameters there, but I did it to show that they're not really any different from the n parameter. You can use it in your definition of (>>=).
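Putting the pieces together, a complete version without the hardcoded 0 might look like the sketch below. The Functor and Applicative instances are added because current GHC requires them before a Monad instance is accepted; tick and label are hypothetical helpers that are not part of the question, included only to exercise the instance.

```haskell
newtype ST a = S (Int -> (a, Int))

runState :: ST a -> Int -> (a, Int)
runState (S s) = s

instance Functor ST where
  fmap f s = S $ \n -> let (a, i) = runState s n in (f a, i)

instance Applicative ST where
  pure x = S (\n -> (x, n))
  sf <*> sa = S $ \n ->
    let (f, i) = runState sf n
        (a, j) = runState sa i
    in (f a, j)

instance Monad ST where
  -- Use the incoming state n instead of a hardcoded 0,
  -- and pass the intermediate state i on to the second stage.
  s >>= f = S $ \n ->
    let (a, i) = runState s n
    in runState (f a) i

-- Hypothetical helper: return the counter and increment it.
tick :: ST Int
tick = S (\n -> (n, n + 1))

-- Hypothetical helper: label each element with a counter value.
label :: [a] -> ST [(Int, a)]
label = mapM (\x -> do { i <- tick; pure (i, x) })
```

Running runState (label "abc") 5 gives ([(5,'a'),(6,'b'),(7,'c')],8); the starting value is supplied by the caller rather than baked into (>>=).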
I'm using the FreeT type from the free library to write this function which "runs" an underlying StateT:
runStateFree
    :: (Functor f, Monad m)
    => s
    -> FreeT f (StateT s m) a
    -> FreeT f m (a, s)
runStateFree s0 (FreeT x) = FreeT $ do
    flip fmap (runStateT x s0) $ \(r, s1) -> case r of
      Pure y -> Pure (y, s1)
      Free z -> Free (runStateFree s1 <$> z)
However, I'm trying to convert it to work on FT, the church-encoded version, instead:
runStateF
    :: (Functor f, Monad m)
    => s
    -> FT f (StateT s m) a
    -> FT f m (a, s)
runStateF s0 (FT x) = FT $ \ka kf -> ...
but I'm not quite having the same luck. Every sort of combination of things I get seems to not quite work out. The closest I've gotten is
runStateF s0 (FT x) = FT $ \ka kf ->
    ka =<< runStateT (x pure (\n -> _ . kf (_ . n))) s0
But the type of the first hole is m r -> StateT s m r and the type the second hole is StateT s m r -> m r...which means we necessarily lose the state in the process.
I know that all FreeT functions are possible to write with FT. Is there a nice way to write this that doesn't involve round-tripping through FreeT (that is, in a way that requires explicitly matching on Pure and Free)? (I've tried manually inlining things but I don't know how to deal with the recursion using different ss in the definition of runStateFree). Or maybe this is one of those cases where the explicit recursive data type is necessarily more performant than the church (mu) encoding?
Here's the definition. There are no tricks in the implementation itself. Don't think and make it type check. Yes, at least one of these fmap is morally questionable, but the difficulty is actually to convince ourselves it does the Right thing.
runStateF
    :: (Functor f, Monad m)
    => s
    -> FT f (StateT s m) a
    -> FT f m (a, s)
runStateF s0 (FT run) = FT $ \return0 handle0 ->
  let returnS a = StateT (\s -> fmap (\r -> (r, s)) (return0 (a, s)))
      handleS k e = StateT (\s -> fmap (\r -> (r, s)) (handle0 (\x -> evalStateT (k x) s) e))
  in evalStateT (run returnS handleS) s0
We have two stateless functions (i.e., plain m)
return0 :: a -> m r
handle0 :: forall x. (x -> m r) -> f x -> m r
and we must wrap them in two stateful (StateT s m) variants with the signatures below. The comments that follow give some details about what is going on in the definition of handleS.
returnS :: a -> StateT s m r
handleS :: forall x. (x -> StateT s m r) -> f x -> StateT s m r
-- 1.                                                     ^ grab the current state 's' here
-- 2.                                                       ^ call handle0 to produce that 'm'
-- 3.                             ^ here we will have to provide some state 's': pass the current state we just grabbed.
--    The idea is that 'handle0' is stateless in handling 'f x',
--    so it is fine for this continuation (x -> StateT s m r) to get the state from before the call to 'handle0'
There is an apparently dubious use of fmap in handleS, but it is valid as long as run never looks at the states produced by handleS. It is almost immediately thrown away by one of the evalStateT.
In theory, there exist terms of type FT f (StateT s m) a which break that invariant. In practice, that almost certainly doesn't occur; you would really have to go out of your way to do something morally wrong with those continuations.
In the following complete gist, I also show how to test with QuickCheck that it is indeed equivalent to your initial version using FreeT, with concrete evidence that the above invariant holds:
https://gist.github.com/Lysxia/a0afa3ca2ea9e39b400cde25b5012d18
I'd say that no, as even something as simple as cutoff converts to FreeT:
cutoff :: (Functor f, Monad m) => Integer -> FT f m a -> FT f m (Maybe a)
cutoff n = toFT . FreeT.cutoff n . fromFT
In general, you're probably looking at:
improve :: Functor f => (forall m. MonadFree f m => m a) -> Free f a
Improve the asymptotic performance of code that builds a free monad with only binds and returns by using F behind the scenes.
I.e. you'll construct Free efficiently, but then do whatever you need to do with Free (maybe again, by improveing).
I looked hard to see if this may be a duplicate question but couldn't find anything that addressed specifically this. My apologies if there actually is something.
So, I get how lift works, it lifts a monadic action (fully defined) from the outer-most transformer into the transformed monad. Cool.
But what if I want to apply a (>>=) from one level under the transformer into the transformer? I'll explain with an example.
Say MyTrans is a MonadTrans, and there is also an instance Monad m => Monad (MyTrans m). Now, the (>>=) from this instance will have this signature:
instance Monad m => Monad (MyTrans m) where
    (>>=) :: MyTrans m a -> (a -> MyTrans m b) -> MyTrans m b
but what I need is something like this:
(>>=!) :: Monad m => MyTrans m a -> (m a -> MyTrans m b) -> MyTrans m b
In general:
(>>=!) :: (MonadTrans t, Monad m) => t m a -> (m a -> t m b) -> t m b
It looks like a combination of the original (>>=) and lift, except it really isn't. lift can only be used on covariant arguments of type m a to transform them into a t m a, not the other way around. In other words, the following has the wrong type:
(>>=!?) :: Monad m => MyTrans m a -> (a -> m b) -> MyTrans m b
x >>=!? f = x >>= (lift . f)
Of course a general colift :: (MonadTrans t, Monad m) => t m a -> m a makes absolutely zero sense, because surely the transformer is doing something that we cannot just throw away like that in all cases.
But just like (>>=) introduces contravariant arguments into the monad by ensuring that they will always "come back", I thought something along the lines of the (>>=!) function would make sense: Yes, it in some way makes an m a from a t m a, but only because it does all of this within t, just like (>>=) makes an a from an m a in some way.
I've thought about it and I don't think (>>=!) can be in general defined from the available tools. In some sense it is more than what MonadTrans gives. I haven't found any related type classes that offer this either. MFunctor is related but it is a different thing, for changing the inner monad, but not for chaining exclusively transformer-related actions.
By the way, here is an example of why you would want to do this:
EDIT: I tried to present a simple example but I realized that that one could be solved with the regular (>>=) from the transformer. My real example (I think) cannot be solved with this. If you think every case can be solved with the usual (>>=), please do explain how.
Should I just define my own type class for this and give some basic implementations? (I'm interested in StateT, and I'm almost certain it can be implemented for it) Am I doing something in a twisted way? Is there something I overlooked?
Thanks.
EDIT: The answer provided by Fyodor matches the types, but does not do what I want, since by using pure, it is ignoring the monadic effects of the m monad. Here is an example of it giving the wrong answer:
Take t = StateT Int and m = [].
x1 :: StateT Int [] Int
x1 = StateT (\s -> [(1,s),(2,s),(3,s)])
x2 :: StateT Int [] Int
x2 = StateT (\s -> [(1,s),(2,s),(3,s),(4,s)])
f :: [Int] -> StateT Int [] Int
f l = StateT (\s -> if (even s) then [] else (if (even (length l)) then (fmap (\z -> (z,z+s)) l) else [(123,123)]))
runStateT (x1 >>= (\a -> f (pure a))) 1 returns [(123,123),(123,123),(123,123)] as expected, since 1 is odd and the list in x1 has odd length.
But runStateT (x2 >>= (\a -> f (pure a))) 1 returns [(123,123),(123,123),(123,123),(123,123)], whereas I would have expected it to return [(1,2),(2,3),(3,4),(4,5)], since the 1 is odd and the length of the list is even. Instead, f is being evaluated on the singleton lists [1], [2], [3] and [4] independently, due to the pure call.
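The behaviour described in this edit can be reproduced with the sketch below. To keep it self-contained it defines a minimal StateT of its own (an assumption; the real one lives in the transformers package and should behave the same way here), and it spells out the bind-plus-pure implementation as (>>=!) specialised to StateT.

```haskell
-- Minimal stand-in for transformers' StateT, so this runs with base only.
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

instance Monad m => Functor (StateT s m) where
  fmap f (StateT g) = StateT $ \s -> do
    (a, s') <- g s
    pure (f a, s')

instance Monad m => Applicative (StateT s m) where
  pure x = StateT $ \s -> pure (x, s)
  StateT ff <*> StateT fa = StateT $ \s -> do
    (f, s')  <- ff s
    (a, s'') <- fa s'
    pure (f a, s'')

instance Monad m => Monad (StateT s m) where
  StateT g >>= f = StateT $ \s -> do
    (a, s') <- g s
    runStateT (f a) s'

-- The bind + pure implementation, specialised to StateT.
(>>=!) :: Monad m => StateT s m a -> (m a -> StateT s m b) -> StateT s m b
x >>=! f = x >>= \a -> f (pure a)

x1, x2 :: StateT Int [] Int
x1 = StateT (\s -> [(1, s), (2, s), (3, s)])
x2 = StateT (\s -> [(1, s), (2, s), (3, s), (4, s)])

f :: [Int] -> StateT Int [] Int
f l = StateT $ \s ->
  if even s then []
  else if even (length l) then fmap (\z -> (z, z + s)) l
  else [(123, 123)]
```

Here runStateT (x2 >>=! f) 1 produces four copies of (123,123): each branch of the list calls f on a one-element list, so the "even length" case is never reached.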
This can be very trivially implemented via bind + pure. Consider the signature:
(>>=!) :: (Monad m, MonadTrans t) => t m a -> (m a -> t m b) -> t m b
If you use bind on the first argument, you get yourself a naked a, and since m is a Monad, you can trivially turn that naked a into an m a via pure. Therefore, the straightforward implementation would be:
(>>=!) x f = x >>= \a -> f (pure a)
And because of this, bind is always strictly more powerful than your proposed new operation (>>=!), which is probably the reason it doesn't exist in the standard libraries.
I think it may be possible to propose more clever interpretations of (>>=!) for some specific transformers or specific underlying monads. For example, if m ~ [], one might imagine passing the whole list as m a instead of its elements one by one, as my generic implementation above would do. But this sort of thing seems too specific to be implemented in general.
If you have a very specific example of what you're after, and you can show that my above general implementation doesn't work, then perhaps I can provide a better answer.
Ok, to address your actual problem from the comments:
I have a function f :: m a -> m b -> m c that I want to transform into a function ff :: StateT s m a -> StateT s m b -> StateT s m c
I think looking at this example may illustrate the difficulty better. Consider the required signature:
liftish :: Monad m => (m a -> m b -> m c) -> StateT s m a -> StateT s m b -> StateT s m c
Presumably, you'd want to keep the effects of m that are already "imprinted" within the StateT s m a and StateT s m b parameters (because if you don't - my simple solution above will work). To do this, you can "unwrap" the StateT via runStateT, which will get you m a and m b respectively, which you can then use to obtain m c:
liftish f sa sb = do
  s <- get
  let ma = fst <$> runStateT sa s
      mb = fst <$> runStateT sb s
  lift $ f ma mb
But here's the trouble: see those fst <$> in there? They are throwing away the resulting state. The call to runStateT sa s results not only in the m a value, but also in the new, modified state. And same goes for runStateT sb s. And presumably you'd want to get the state that resulted from runStateT sa and pass it to runStateT sb, right? Otherwise you're effectively dropping some state mutations.
But you can't get to the resulting state of runStateT sa, because it's "wrapped" inside m. Because runStateT returns m (a, s) instead of (m a, s). If you knew how to "unwrap" m, you'd be fine, but you don't. So the only way to get that intermediate state is to run the effects of m:
liftish f sa sb = do
  s <- get
  (c, s'') <- lift $ do
    let ma = runStateT sa s
    (_, s') <- ma
    let mb = runStateT sb s'
    (_, s'') <- mb
    c <- f (fst <$> ma) (fst <$> mb)
    pure (c, s'')
  put s''
  pure c
But now see what happens: I'm using ma and mb twice: once to get the new states out of them, and second time by passing them to f. This may lead to double-running effects or worse.
This problem of "double execution" will, I think, show up for any monad transformer, simply because the transformer's effects are always wrapped inside the underlying monad, so you have a choice: either drop the transformer's effects or execute the underlying monad's effects twice.
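The double execution can be made visible with a small experiment. This is a sketch under a few assumptions: StateT is a minimal local stand-in for the transformers type, lift', get' and put' are local versions of lift, get and put, and the underlying monad is the pair monad ((,) [String]) from base, used as a poor man's writer so that every effect leaves a log entry. stepA, stepB and demo are hypothetical names.

```haskell
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

instance Monad m => Functor (StateT s m) where
  fmap f (StateT g) = StateT $ \s -> do
    (a, s') <- g s
    pure (f a, s')

instance Monad m => Applicative (StateT s m) where
  pure x = StateT $ \s -> pure (x, s)
  StateT ff <*> StateT fa = StateT $ \s -> do
    (f, s')  <- ff s
    (a, s'') <- fa s'
    pure (f a, s'')

instance Monad m => Monad (StateT s m) where
  StateT g >>= f = StateT $ \s -> do
    (a, s') <- g s
    runStateT (f a) s'

-- Local stand-ins for lift, get and put.
lift' :: Monad m => m a -> StateT s m a
lift' m = StateT $ \s -> do { a <- m; pure (a, s) }

get' :: Monad m => StateT s m s
get' = StateT $ \s -> pure (s, s)

put' :: Monad m => s -> StateT s m ()
put' s = StateT $ \_ -> pure ((), s)

-- The second liftish from above, unchanged apart from the primed helpers.
liftish :: Monad m
        => (m a -> m b -> m c) -> StateT s m a -> StateT s m b -> StateT s m c
liftish f sa sb = do
  s <- get'
  (c, s'') <- lift' $ do
    let ma = runStateT sa s
    (_, s') <- ma
    let mb = runStateT sb s'
    (_, s'') <- mb
    c <- f (fst <$> ma) (fst <$> mb)
    pure (c, s'')
  put' s''
  pure c

-- Two steps that each log their name and bump the state.
stepA, stepB :: StateT Int ((,) [String]) Int
stepA = StateT $ \s -> (["A"], (10, s + 1))
stepB = StateT $ \s -> (["B"], (20, s + 1))

demo :: ([String], (Int, Int))
demo = runStateT (liftish (\ma mb -> (+) <$> ma <*> mb) stepA stepB) 0
```

Evaluating demo gives (["A","B","A","B"],(30,2)): the state changes are kept, but each logging effect shows up twice, once when the states are extracted and once more inside f.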
I think what you "really want" is
(>>>==) :: MyTrans m a -> (forall b. m b -> MyTrans n b) -> MyTrans n a
-- (=<<) = flip (>>=) is nicer to think about, because it shows that it's a form of function application
-- so let's think about
(==<<<) :: (forall b. m b -> MyTrans n b) -> (forall a. MyTrans m a -> MyTrans n a)
-- hmm...
type (~>) a b = forall x. a x -> b x
(==<<<) :: (m ~> MyTrans n) -> MyTrans m ~> MyTrans n
-- look familiar?
That is, you are describing monads on the category of monads.
class MonadTrans t => MonadMonad t where
    -- returnM :: m ~> t m
    -- but that's just lift, therefore the MonadTrans t superclass
    -- note: input must be a monad homomorphism or else all bets are off
    -- output is also a monad homomorphism
    (==<<<) :: (Monad m, Monad n) => (m ~> t n) -> t m ~> t n
instance MonadMonad (StateT s) where
    -- fairly sure this is lawful
    -- EDIT: probably not
    f ==<<< StateT x = do
        (a, s') <- f . x =<< get
        a <$ put s'
However, making your example work is just not going to happen. It is too unnatural. StateT Int [] is the monad for programs that nondeterministically evolve the state. It is an important property of that monad that each "parallel universe" receives no communication from the others. The specific operation you are performing will probably not be provided by any useful typeclass. You can only do part of it:
f :: [] ~> StateT Int []
f l = StateT \s -> if odd s && even (length l) then fmap (\x -> (x, s)) l else []
runStateT (f ==<<< x1) 1 = []
runStateT (f ==<<< x2) 1 = [(1,1),(2,1),(3,1),(4,1)]
the haskell wiki (here : https://wiki.haskell.org/State_Monad ) says the state monad bind operator is defined like this :
(>>=) :: State s a -> (a -> State s b) -> State s b
(act1 >>= fact2) s = runState act2 is
    where (iv,is) = runState act1 s
          act2 = fact2 iv
however it seems incorrect to me as the result of the bind operator is a function wrapped in a constructor thus cannot be applied (I'm talking about this pattern : (act1 >>= fact2) s)
In short: A State object itself does not encapsulate the state, it encapsulates the change of a state.
Indeed, the State type is defined as:
newtype State s a = State { runState :: s -> (a, s) }
where runState is thus a function that takes a state s, and returns a result a and a new state.
The bind operator (>>=) :: State s a -> (a -> State s b) -> State s b basically "chains" state changes together. It thus takes one state changing function f1 :: s -> (a, s), and a function f2 :: a -> State s b, and thus creates a function g :: s -> (b, s) so to speak that is encapsulated in a State constructor. The second function f2 thus takes an a and returns such state changing function as well.
So the bind operator can be defined as:
(State f1) >>= f2 = State $ \i -> let (y, s) = f1 i in runState (f2 y) s
Here we have i as initial state, and we thus will first "chain" i through the f1 state changer. This returns then a 2-tuple: y is the "result" of that call, and s is the new state, we will then pass the result and the new state to f2. Note that here we do not make state changes at all, we only construct a State object that can do that. We thus postpone the real chaining.
If the State is however defined as above, then the piece of code does not match that definition, it defines it, like #HTWN says, as:
type State s a = s -> (a, s)
In that case, it is correct, given that runState is then the id function, since then:
(>>=) :: State s a -> (a -> State s b) -> State s b
(>>=) act1 fact2 = f
    where f s = act2 is
              where (iv,is) = act1 s
                    act2 = fact2 iv
In order to make it compatible with our State type, we thus add some logic to unwrap and wrap it in the State data constructor:
(>>=) :: State s a -> (a -> State s b) -> State s b
(>>=) act1 fact2 = State f
    where f s = runState act2 is
              where (iv,is) = runState act1 s
                    act2 = fact2 iv
then it is indeed correct. The main error is not wrapping it in a State data constructor.
How to formally calculate/interpret the following expression?
runState (join (State $ \s -> (push 10,1:2:s))) [0,0,0]
I understand the informal explanation, which says: first run the outer stateful computation and then the resulting one.
Well, that's quite strange to me since if I follow the join and >>= definitions, it looks to me like I have to start from the internal monad (push 10) as the parameter of the id, and then do... hmmmm... well... I'm not sure what.... in order to get what is supposedly the result:
((),[10,1,2,0,0,0])
However how to explain it by the formal definitions:
instance Monad (State s) where
    return x = State $ \s -> (x,s)
    (State h) >>= f = State $ \s -> let (a, newState) = h s
                                        (State g) = f a
                                    in  g newState
and
join :: Monad m => m (m a) -> m a
join n = n >>= id
Also, the definition of the State Monad's bind (>>=) is quite hard to grasp as having some "intuitive"/visual meaning (as opposed to just a formal definition that would satisfy the Monad laws). Does it have a less formal and more intuitive meaning?
The classic definition of State is pretty simple.
newtype State s a = State {runState :: s -> (a,s) }
A State s a is a "computation" (actually just a function) that takes something of type s (the initial state) and produces something of type a (the result) and something of type s (the final state).
The definition you give in your question for >>= makes State s a a "lazy state transformer". This is useful for some things, but a little harder to understand and less well-behaved than the strict version, which goes like this:
m >>= f = State $ \s ->
  case runState m s of
    (x, s') -> runState (f x) s'
I've removed the laziness and also taken the opportunity to use a record selector rather than pattern matching on State.
What's this say? Given an initial state, I runState m s to get a result x and a new state s'. I apply f to x to get a state transformer, and then run that with initial state s'.
The lazy version just uses lazy pattern matching on the tuple. This means that the function f can try to produce a state transformer without inspecting its argument, and that transformer can try to run without looking at the initial state. You can use this laziness in some cases to tie recursive knots, implement funny functions like mapAccumR, and use state in lazy incremental stream processing, but most of the time you don't really want/need that.
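One way to observe the difference is the sketch below, which uses its own State newtype with the lazy bind (a let rather than a case). The names counter and firstThree are hypothetical. Functor and Applicative are derived from the Monad instance via liftM and ap so that the laziness of (>>=) carries through to sequence.

```haskell
import Control.Monad (ap, liftM)

newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap = liftM

instance Applicative (State s) where
  pure x = State $ \s -> (x, s)
  (<*>) = ap

instance Monad (State s) where
  -- Lazy bind: the tuple pattern is only forced when x or s' is demanded.
  m >>= f = State $ \s ->
    let (x, s') = runState m s
    in runState (f x) s'

-- Return the current counter and increment it.
counter :: State Int Int
counter = State $ \s -> (s, s + 1)

-- An infinite list of actions; laziness lets us consume a finite prefix.
firstThree :: [Int]
firstThree = take 3 (fst (runState (sequence (repeat counter)) 0))
```

With the lazy bind, firstThree evaluates to [0,1,2]. As far as I can tell, swapping in the strict case-based bind makes the same expression diverge, because the whole infinite sequence has to run before the first tuple is returned.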
Lee explains pretty well what join does, I think.
If you specialise the type of join for State s you get:
join :: State s (State s a) -> State s a
so given a stateful computation which returns a result which is another stateful computation, join combines them into a single one.
The definition of push is not given in your question but I assume it looks like:
push :: a -> State [a] ()
push x = modify (x:)
along with some State type like
data State s a = State (s -> (a, s))
A value of State s a is a function which, given a value for the current state of type s returns a pair containing a result of type a and a new state value. Therefore
State $ \s -> (push 10,1:2:s)
has type State [Int] (State [Int] ()) (or the same with some numeric type other than Int). The outer State function returns as its result another State computation, and updates the state to have the values 1 and 2 pushed onto it.
An implementation of join for this State type would look like:
join :: State s (State s a) -> State s a
join outer = State $ \s ->
    let (inner, s') = runState outer s
    in  runState inner s'
so it constructs a new stateful computation which first runs the outer computation to return a pair containing the inner computation and the new state. The inner computation is then run with the intermediate state.
If you plug your example into this definition then
outer = (State $ \s -> (push 10,1:2:s))
s = [0,0,0]
inner = push 10
s' = [1,2,0,0,0]
and the result is therefore the result of runState (push 10) [1,2,0,0,0] which is ((),[10,1,2,0,0,0])
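The walkthrough above can be checked mechanically. In this sketch State is defined locally as a newtype with a runState field, push is the assumed definition (written out without modify), and join is renamed joinState to avoid clashing with Control.Monad.join; example is a hypothetical name for the expression from the question.

```haskell
newtype State s a = State { runState :: s -> (a, s) }

-- Assumed definition of push; the question does not show it.
push :: a -> State [a] ()
push x = State $ \s -> ((), x : s)

-- join written directly: run the outer computation, then the inner one
-- it returned, starting from the intermediate state.
joinState :: State s (State s a) -> State s a
joinState outer = State $ \s ->
  let (inner, s') = runState outer s
  in runState inner s'

-- The expression from the question.
example :: ((), [Int])
example = runState (joinState (State $ \s -> (push 10, 1 : 2 : s))) [0, 0, 0]
```

Evaluating example yields ((),[10,1,2,0,0,0]), matching the hand calculation: 1 and 2 are pushed by the outer computation first, and 10 by the inner one afterwards.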
You mentioned following the definitions for join and >>=, so, let's try that.
runState (join (State $ \s -> (push 10,1:2:s))) [0,0,0] = ?
The definitions are, again
instance Monad (State s) where
    -- return :: a -> State s a
    return x = State $ \s -> (x,s)
so for x :: a, State $ \s -> (x,s) :: State s a; (*) ---->
    (State h) >>= f = State $ \s -> let (a, newState) = h s
                                        (State g) = f a
                                    in  g newState
join m = m >>= id
and runState :: State s a -> s -> (a, s), i.e. it should be (*) <----
runState (State g) s = g s. So, following the definitions we have
runState (join (State $ \s -> (push 10,1:2:s))) [0,0,0]
  = runState (State g) [0,0,0]
  where (State g) = join (State $ \s -> (push 10,1:2:s))
          = (State $ \s -> (push 10,1:2:s)) >>= id
          -- (State h                     ) >>= f
          = State $ \s -> let (a, newState) = h s
                              (State g) = id a
                              h s = (push 10,1:2:s)
                          in  g newState
          = State $ \s -> let (a, newState) = (push 10,1:2:s)
                              (State g) = a
                          in  g newState
          = State $ \s -> let (State g) = push 10
                          in  g (1:2:s)
Now, push 10 :: State s a is supposed to match with State g where g :: s -> (a, s); most probably it's defined as push 10 = State \s-> ((),(10:) s); so we have
          = State $ \s -> let (State g) = State \s-> ((),(10:) s)
                          in  g (1:2:s)
          = State $ \s -> let g s = ((),(10:) s)
                          in  g (1:2:s)
          = State $ \s -> ((),(10:) (1:2:s))
  = runState (State $ \s -> ((),(10:) (1:2:s)) ) [0,0,0]
  = (\s -> ((),(10:) (1:2:s))) [0,0,0]
  = ((), 10:1:2:[0,0,0])
So you see that push 10 is first produced as a result-value (with (a, newState) = (push 10,1:2:s)); then it is treated as the computation-description of type State s a, so it is run last (not first, as you thought).
As Lee describes, join :: State s (State s a) -> State s a; the meaning of this type is, a computation of type State s (State s a) is one that produces State s a as its result-value, and that is push 10; we can run it only after we get hold of it.