Initial state of state monad and how it first gets passed along - haskell

I was reading about the State monad and after writing the code below I cannot understand how the initial state (mkStdGen 42) is passed into the rollDice function?
How can get work on that initial value?
I understand how the rollDice function works, but I cannot imagine how the initial state is lifted into the monad.
import Control.Monad.Trans.State
import System.Random
rollDice :: State StdGen Int
rollDice = do
generator <- get
let (x, s) = randomR (1,6) generator
put s
return x
main :: IO ()
main = putStrLn . show $ evalState rollDice (mkStdGen 42)

To keep things simple, think about a State s a as a function, which takes an initial state s and returns a value and a new state (a , s), just as in the wikibook:
newtype State s a = S { runState :: s -> (s , a) }
Now lets use rollDicePure = runState rollDice. What's rollDicePure type? Given that runState only removes the State newtype, it should be
rollDicePure :: StdGen -> (Int, StdGen)
So runState (or evalState/execState) simply turn a stateful computation into a simple function, whose argument is the initial state.
If you wonder how get then gets this state, think of it as
getPure :: s -> (s, s)
getPure st = (st, st)
The actual definitions are a little bit more complicated than that, but the semantics hold.

Related

Generate a sequential or random value each time a function is called

I need to make each instance of Sphere get a unique identifier so that no two Spheres are equal. I won't know ahead of time how many spheres I'll need to make so will need to make them one at a time, but still increment the identifier.
Most solutions I've tried have this issue where I end up with an IO a and need the unsafePerformIO to get the value.
This code comes close, but the resulting identifier is always the same:
module Shape ( Sphere (..)
, sphere
, newID
) where
import System.Random
import System.IO.Unsafe (unsafePerformIO)
data Sphere = Sphere { identifier :: Int
} deriving (Show, Eq)
sphere :: Sphere
sphere = Sphere { identifier = newID }
newID :: Int
newID = unsafePerformIO (randomRIO (1, maxBound :: Int))
This would work as well, and works great in the REPL, but when I put it in a function, it only returns a new value the first time and the same value after that.
import Data.Unique
sphere = Sphere { identifier = (hashUnique $ unsafePerformIO newUnique) }
I know think this all leads to the State Monad, but I don't understand that yet. Is there no other way that will "get the job done", without biting off all the other monad stuff?
First of all, don’t use unsafePerformIO here. It doesn’t do what you want anyway: it doesn’t “get the a out of an IO a”, since an IO a doesn’t contain an a; rather, unsafePerformIO hides an IO action behind a magical value that executes the action when somebody evaluates the value, which could happen multiple times or never because of laziness.
Is there no other way that will "get the job done", without biting off all the other monad stuff?
Not really. You’re going to have to maintain some kind of state if you want to generate unique IDs. (You may be able to avoid needing unique IDs altogether, but I don’t have enough context to say.) State can be handled in a few ways: manually passing values around, using State to simplify that pattern, or using IO.
Suppose we want to generate sequential IDs. Then the state is just an integer. A function that generates a fresh ID can simply take that state as input and return an updated state. I think you’ll see straight away why that’s too simple, so we tend to avoid writing code like this:
-- Differentiating “the next-ID state” from “some ID” for clarity.
newtype IdState = IdState Id
type Id = Int
-- Return new sphere and updated state.
newSphere :: IdState -> (Sphere, IdState)
newSphere s0 = let
(i, s1) = newId s0
in (Sphere i, s1)
-- Return new ID and updated state.
newId :: IdState -> (Id, IdState)
newId (IdState i) = (i, IdState (i + 1))
newSpheres3 :: IdState -> ((Sphere, Sphere, Sphere), IdState)
newSpheres3 s0 = let
(sphere1, s1) = newSphere s0
(sphere2, s2) = newSphere s1
(sphere3, s3) = newSphere s2
in ((sphere1, sphere2, sphere3), s3)
main :: IO ()
main = do
-- Generate some spheres with an initial ID of 0.
-- Ignore the final state with ‘_’.
let (spheres, _) = newSpheres3 (IdState 0)
-- Do stuff with them.
print spheres
Obviously this is very repetitive and error-prone, since we have to pass the correct state along at each step. The State type has a Monad instance that abstracts out this repetitive pattern and lets you use do notation instead:
import Control.Monad.Trans.State (State, evalState, state)
newSphere :: State IdState Sphere
newSphere = do
i <- newId
pure (Sphere i)
-- or:
-- newSphere = fmap Sphere newId
-- newSphere = Sphere <$> newId
-- Same function as before, just wrapped in ‘State’.
newId :: State IdState Id
newId = state (\ (IdState i) -> (i, IdState (i + 1)))
-- Much simpler!
newSpheres3 :: State IdState (Sphere, Sphere, Sphere)
newSpheres3 = do
sphere1 <- newSphere
sphere2 <- newSphere
sphere3 <- newSphere
pure (sphere1, sphere2, sphere3)
-- or:
-- newSpheres3 = (,,) <$> newSphere <*> newSphere <*> newSphere
main :: IO ()
main = do
-- Run the ‘State’ action and discard the final state.
let spheres = evalState newSpheres3 (IdState 0)
-- Again, do stuff with the results.
print spheres
State is what I would reach for normally, since it can be used within pure code, and combined with other effects without much trouble using StateT, and because it’s actually immutable under the hood, just an abstraction on top of passing values around, you can easily and efficiently save and roll back states.
If you want to use randomness, Unique, or make your state actually mutable, you generally have to use IO, because IO is specifically about breaking referential transparency like that, typically by interacting with the outside world or other threads. (There are also alternatives like ST for putting imperative code behind a pure API, or concurrency APIs like Control.Concurrent.STM.STM, Control.Concurrent.Async.Async, and Data.LVish.Par, but I won’t go into them here.)
Fortunately, that’s very similar to the State code above, so if you understand how to use one, it should be easier to understand the other.
With random IDs using IO (not guaranteed to be unique):
import System.Random
newSphere :: IO Sphere
newSphere = Sphere <$> newId
newId :: IO Id
newId = randomRIO (1, maxBound :: Id)
newSpheres3 :: IO (Sphere, Sphere, Sphere)
newSpheres3 = (,,) <$> newSphere <*> newSphere <*> newSphere
main :: IO ()
main = do
spheres <- newSpheres3
print spheres
With Unique IDs (also not guaranteed to be unique, but unlikely to collide):
import Data.Unique
newSphere :: IO Sphere
newSphere = Sphere <$> newId
newId :: IO Id
newId = hashUnique <$> newUnique
-- …
With sequential IDs, using a mutable IORef:
import Data.IORef
newtype IdSource = IdSource (IORef Id)
newSphere :: IdSource -> IO Sphere
newSphere s = Sphere <$> newId s
newId :: IdSource -> IO Id
newId (IdSource ref) = do
i <- readIORef ref
writeIORef ref (i + 1)
pure i
-- …
You’re going to have to understand how to use do notation and functors, applicatives, and monads at some point, because that’s just how effects are represented in Haskell. You don’t necessarily need to understand every detail of how they work internally in order to just use them, though. I got pretty far when I was learning Haskell with some rules of thumb, like:
A do statement can be:
An action: (action :: m a)
Often m () in the middle
Often pure (expression :: a) :: m a at the end
A let binding for expressions: let (var :: a) = (expression :: a)
A monadic binding for actions: (var :: a) <- (action :: m a)
f <$> action applies a pure function to an action, short for do { x <- action; pure (f x) }
f <$> action1 <*> action2 applies a pure function of multiple arguments to multiple actions, short for do { x <- action1; y <- action2; pure (f x y) }
action2 =<< action1 is short for do { x <- action1; action2 x }

MonadRandom, State and monad transformers

I'm writing some code (around card-playing strategies) that uses State and recursion together. Perhaps this part doesn't need to actually (it already feels clumsy to me, even as a relative beginner), but there are other parts that probably do so my general question stands...
My initial naive implementation is entirely deterministic (the choice of bid is simply the first option provided by the function validBids):
bidOnRound :: (DealerRules d) => d -> NumCards -> State ([Player], PlayerBids) ()
bidOnRound dealerRules cardsThisRound = do
(players, bidsSoFar) <- get
unless (List.null players) $ do
let options = validBids dealerRules cardsThisRound bidsSoFar
let newBid = List.head $ Set.toList options
let p : ps = players
put (ps, bidsSoFar ++ [(p, newBid)])
bidOnRound dealerRules cardsThisRound
And I call it from:
playGame :: (DealerRules d, ScorerRules s) => d -> s -> StateT Results IO ()
...
let (_, bidResults) = execState (bidOnRound dealerRules cardsThisRound) (NonEmpty.toList players, [])
Now I'm aware that I need to bring randomness into this and several other parts of the code. Not wanting to litter IO everywhere, nor pass round random seeds manually all the time, I feel I should be using MonadRandom or something. A library I'm using uses it to good effect. Is this a wise choice?
Here's what I tried:
bidOnRound :: (DealerRules d, RandomGen g) => d -> NumCards -> RandT g (State ([Player], PlayerBids)) ()
bidOnRound dealerRules cardsThisRound = do
(players, bidsSoFar) <- get
unless (List.null players) $ do
let options = Set.toList $ validBids dealerRules cardsThisRound bidsSoFar
rnd <- getRandomR (0 :: Int, len options - 1)
let newBid = options List.!! rnd
let p : ps = players
put (ps, bidsSoFar ++ [(p, newBid)])
bidOnRound dealerRules cardsThisRound
but I'm uncomfortable already, plus can't work out how to call this, e.g. using evalRand in combination with execState etc. The more I read on MonadRandom, RandGen and mtl vs others, the less sure I am of what I'm doing...
How should I neatly combine Randomness and State and how do I call these properly?
Thanks!
EDIT: for reference, full current source on Github.
Well how about an example to help you out. Since you didn't post a full working code snippet I'll just replace a lot of your operations and show how the monads can be evaluated:
import Control.Monad.Trans.State
import Control.Monad.Random
import System.Random.TF
bidOnRound :: (RandomGen g) => Int -> RandT g (State ([Int], Int)) ()
bidOnRound i =
do rand <- getRandomR (10,20)
s <- lift $ get
lift $ put ([], i + rand + snd s)
main :: IO ()
main =
do g <- newTFGen
print $ flip execState ([],1000) $ evalRandT (bidOnRound 100) g
The thing to note here is you "unwrap" the outer monad first. So if you have RandT (StateT Reader ...) ... then you run RandT (ex via evalRandT or similar) then the state then the reader. Secondly, you must lift from the outer monad to use operations on the inner monad. This might seem clumsy and that is because it is horribly clumsy.
The best developers I know - those whose code I enjoy looking at and working with - extract monad operations and provide an API with all the primitives complete so I don't need to think about the structure of the monad while I'm thinking about the structure of the logic I'm writing.
In this case (it will be slightly contrived since I wrote the above without any application domain, rhyme or reason) you could write:
type MyMonad a = RandT TFGen (State ([Int],Int)) a
runMyMonad :: MyMonad () -> IO Int
runMyMonad f =
do g <- newTFGen
pure $ snd $ flip execState ([],1000) $ evalRandT f g
With the Monad defined as a simple alias and execution operation the basic functions are easier:
flipCoin :: MyMonad Int
flipCoin = getRandomR (10,20)
getBaseValue :: MyMonad Int
getBaseValue = snd <$> lift get
setBaseValue :: Int -> MyMonad ()
setBaseValue v = lift $ state $ \s -> ((),(fst s, v))
With that leg-work out of the way, which is usually a minor part of making a real application, the domain specific logic is easier to write and certainly easier to read:
bidOnRound2 :: Int -> MyMonad ()
bidOnRound2 i =
do rand <- flipCoin
old <- getBaseValue
setBaseValue (i + rand + old)
main2 :: IO ()
main2 = print =<< runMyMonad (bidOnRound2 100)

Monad State And get function

type InterpreterMonad = StateT (Env, Env) (ErrorT String IO ) ()
interpreter :: Stmts -> InterpreterMonad
interpreter (Statements s EmptyStmts) = interpreteStmt s
interpreter (Statements s stmts) = interpreteStmt s >>= \m -> (interpreter stmts)
-- definicja funkcji
interpreteStmt :: Stmt -> InterpreterMonad
interpreteStmt (DefFun (VarName name) args typ begin statements end) = get >>=
\(envOut, (sInnerEnvFun, sInnerEnvEVal)) -> case (Map.lookup (VarName name) sInnerEnvFun) of
Nothing -> put ( ( envOut, ((Map.insert (VarName name) (DefFun (VarName name) args typ begin statements end) sInnerEnvFun), sInnerEnvEVal)) ) >>= \_ -> return ()
(Just x) -> throwError "ee"
Hi, I cannot understand why get and put functions can be called? How the Haskell know "where" is state? I have a problem in favour of imperative programming- after all, such function we have to call on object ( as method) or pass state by argument.
When you use the State monad functions get and put and sequence them with do notation or >>=, you are constructing a "recipe" (often called an "action") for accessing and modifying local state that will eventually be used with some particular state. For example:
increment = do
count <- get
put (count + 1)
This action can then be used with the part of the State interface that lets you run actions by passing in a state. We will use execState, which returns the state and discards the other value, to demonstrate:
> execState increment 1
2
execState is what connects the action increment with the state it modifies. increment by itself doesn't contain any state, it is just a recipe for interacting with state that you must evaluate later.
The State type (which StateT is a generalization of) is implemented something like this:
newtype State s a =
State { -- A function that takes a state of type `s` and
-- produces a pair `(a, s)` of a result of type
-- `a` and a new state of type `s`.
runState :: s -> (a, s)
}
This type has instances for Functor, Applicative and Monad that allow you to assemble complex State values out of simpler ones:
-- I won't write the code for these here
instance Functor (State s) where ...
instance Applicative (State s) where ...
instance Monad (State s) where ...
Basically, State is a shortcut for chaining functions of types that look like s -> (a, s), but without having to pass around those s values explicitly. When you're ready to actually "feed" a State action with an s value you use one of these operations (depending on which part of the result you want):
runState :: State s a -> s -> (a, s)
evalState :: State s a -> s -> a
execState :: State s a -> s -> s
What put and get do, then, is interact with the "hidden" state argument that the State type implicitly passes around in your code. Their implementations are something like this:
get :: State s s
get = State (\s -> (s, s))
put :: s -> State s ()
put s = State (\_ -> ((), s))
And that's it!

"Persistently" Impure (IO) Vectors in Haskell, with database-like persistent interface

I have a computation that is best described as iterative mutations on a vector; the final result is the final state of the vector.
The "idiomatic" approach to making this functional, I think, is to simply pass on a new vector object along whenever it is "modified". So your iterative method would be operate_on_vector :: Vector -> Vector, which takes in a vector and outputs the modified vector, which is then fed through the method again.
This method is pretty straightforward and I had no problems implementing it, even being new to Haskell.
Alternatively, one could encapsulate all of this in a State monad and pass along a constantly re-created and modified vector as the state value.
However, I suffer a huge, huge performance cost, as these calculations are pretty intensive, the iterations many (on the order of millions) and the data vectors can get pretty large (on the order of at least thousands of primitives). Re-creating a new vector in memory at every step of the iteration seems pretty costly, data collection or not.
Then I considered how IO works -- it can be seen as basically like State, except the state value is the "World", which is constantly changing.
Maybe I could use something that is like IO to "operate" on a "world"? And the "world" would be the vector in-memory? Sort of like a database query, but everything is in memory.
For example with io you could do
do
putStrLn "enter something"
something <- getLine
putStrLine $ "you entered " ++ something
which can be seen as "performing" putStrLn and "modifying" the World object, returning a new World object and feeding it into the next function, which queryies the world object for a string that is the result of the modification, and then returns another world object after another modification.
Is there anything like that that can do this for mutable vectors?
do
putInVec 0 9 -- index 0, value 9
val <- getFromVec 0
putInVec 0 (val + 1)
, with "impure" "mutable" vectors, instead of passing along a new modified vector at each step.
I believe you can do this using mutable vector and a thin wrapper over Reader + ST (or IO) monad.
It can look like this:
type MyVector = IOVector $x -- Use your own elements type here instead of $x
newtype VectorIO a = VectorIO (ReaderT MyVector IO a) deriving (Monad, MonadReader, MonadIO)
-- You will need GeneralizedNewtypeDeriving extension here
-- Run your computation over an existing vector
runComputation :: MyVector -> VectorIO a -> IO MyVector
runComputation vector (VectorIO action) = runReaderT action vector >> return vector
-- Run your computation over a new vector of the specified length
runNewComputation :: Int -> VectorIO a -> IO MyVector
runNewComputation n action = do
vector <- new n
runComputation vector action
putInVec :: Int -> $x -> VectorIO ()
putInVec idx val = do
v <- ask
liftIO $ write v idx val
getFromVec :: Int -> VectorIO $x
getFromVec idx = do
v <- ask
liftIO $ read v idx
That's really all. You can use VectorIO monad to perform your computations, just like you wanted in your example. If you do not want IO but want pure computations, you can use ST monad; modifications to the code above will be trivial.
Update
Here is an ST-based version:
{-# LANGUAGE GeneralizedNewtypeDeriving, FlexibleInstances, MultiParamTypeClasses, Rank2Types #-}
module Main where
import Control.Monad
import Control.Monad.Trans.Class
import Control.Monad.Reader
import Control.Monad.Reader.Class
import Control.Monad.ST
import Data.Vector as V
import Data.Vector.Mutable as MV
-- Your type of the elements
type E = Int
-- Mutable vector which will be used as a context
type MyVector s = MV.STVector s E
-- Immutable vector compatible with MyVector in its type
type MyPureVector = V.Vector E
-- Simple monad stack consisting of a reader with the mutable vector as a context
-- and of an ST action
newtype VectorST s a = VectorST (ReaderT (MyVector s) (ST s) a) deriving Monad
-- Make the VectorST a reader monad
instance MonadReader (MyVector s) (VectorST s) where
ask = VectorST $ ask
local f (VectorST a) = VectorST $ local f a
reader = VectorST . reader
-- Lift an ST action to a VectorST action
liftST :: ST s a -> VectorST s a
liftST = VectorST . lift
-- Run your computation over an existing vector
runComputation :: MyVector s -> VectorST s a -> ST s (MyVector s)
runComputation vector (VectorST action) = runReaderT action vector >> return vector
-- Run your computation over a new vector of the specified length
runNewComputation :: Int -> VectorST s a -> ST s (MyVector s)
runNewComputation n action = do
vector <- MV.new n
runComputation vector action
-- Run a computation on a new mutable vector and then freeze it to an immutable one
runComputationPure :: Int -> (forall s. VectorST s a) -> MyPureVector
runComputationPure n action = runST $ do
vector <- runNewComputation n action
V.unsafeFreeze vector
-- Put an element into the current vector
putInVec :: Int -> E -> VectorST s ()
putInVec idx val = do
v <- ask
liftST $ MV.write v idx val
-- Retrieve an element from the current vector
getFromVec :: Int -> VectorST s E
getFromVec idx = do
v <- ask
liftST $ MV.read v idx

How can I initialize state in a hidden way in Haskell (like the PRNG does)?

I went through some tutorials on the State monad and I think I got the idea.
For example, as in this nice tutorial:
import Data.Word
type LCGState = Word32
lcg :: LCGState -> (Integer, LCGState)
lcg s0 = (output, s1)
where s1 = 1103515245 * s0 + 12345
output = fromIntegral s1 * 2^16 `div` 2^32
getRandom :: State LCGState Integer
getRandom = get >>= \s0 -> let (x,s1) = lcg s0
in put s1 >> return x
OK, so I can use getRandom:
*Main> runState getRandom 0
(0,12345)
*Main> runState getRandom 0
(0,12345)
*Main> runState getRandom 1
(16838,1103527590)
But I still need to pass the seed to the PRNG every time I call it. I know that the
PRNG available in Haskell implementations does not need that:
Prelude> :module Random
Prelude Random> randomRIO (1,6 :: Int)
(...) -- GHC prints some stuff here
6
Prelude Random> randomRIO (1,6 :: Int)
1
So I probably misunderstood the State monad, because what I could see in most tutorials
doesn't seem to be "persistent" state, but just a convenient way to thread state.
So... How can I have state that is automatically initialized (possible from some
function that uses time and other not-very-predictable data), like the Random module
does?
Thanks a lot!
randomRIO uses the IO monad. This seems to work nicely in the interpreter because the interpreter also works in the IO monad. That's what you are seeing in your example; you can't actually do that at the top-level in code -- you would have to put it in a do-expression like all monads anyway.
In general code you should avoid the IO monad, because once your code uses the IO monad, it is tied to external state forever -- you can't get out of it (i.e. if you have code that uses the IO monad, any code that calls it also has to use the IO monad; there is no safe way to "get out" of it). So the IO monad should only be used for things like accessing the external environment, things where it is absolutely required.
For things like local self-contained state, you should not use the IO monad. You can use the State monad as you have mentioned, or you can use the ST monad. The ST monad contains a lot of the same features as the IO monad; i.e. there is STRef mutable cells, analogous to IORef. And the nice thing about ST compared to IO is that when you are done, you can call runST on an ST monad to get the result of the computation out of the monad, which you can't do with IO.
As for "hiding" the state, that just comes as part of the syntax of do-expressions in Haskell for monads. If you think you need to explicitly pass the state, then you are not using the monad syntax correctly.
Here is code that uses IORef in the IO Monad:
import Data.IORef
foo :: IO Int -- this is stuck in the IO monad forever
foo = do x <- newIORef 1
modifyIORef x (+ 2)
readIORef x
-- foo is an IO computation that returns 3
Here is code that uses the ST monad:
import Control.Monad.ST
import Data.STRef
bar :: Int
bar = runST (do x <- newSTRef 1
modifySTRef x (+ 2)
readSTRef x)
-- bar == 3
The simplicity of the code is essentially the same; except that in the latter case we can get the value out of the monad, and in the former we can't without putting it inside another IO computation.
secretStateValue :: IORef SomeType
secretStateValue = unsafePerformIO $ newIORef initialState
{-# NOINLINE secretStateValue #-}
Now access your secretStateValue with normal readIORef and writeIORef, in the IO monad.
So I probably misunderstood the State monad, because what I could see in most tutorials doesn't seem to be "persistent" state, but just a convenient way to thread state.
The state monad is precisely about threading state through some scope.
If you want top level state, that's outside the language (and you'll have to use a global mutable variable). Note how this will likely complicated thread safety of your code -- how is that state initialized? and when? And by which thread?

Resources