How to store ST monad thing?

I want to handle/store a random generator (Gen (ST {..})) outside of the ST monad, but I couldn't find out how to do it.
Background
I'm working on a simulation that uses random numbers heavily.
Profiling showed that generating random numbers takes more than 50% of the processing time.
To generate random numbers, I use mwc-random and SFMT.
Because of speed, I mainly use SFMT.
However, compared with SFMT, mwc-random has a richer interface that I need (normal, bernoulli, ...).
After benchmarking and reading the code, I understand that mwc-random is not much slower than SFMT when it is used in the ST monad.
(SFMT on IO < MWC on ST << MWC on IO < SFMT on ST)
So I want to create and handle the MWC random generator in the ST monad.
However, I cannot take this generator out of the ST monad, just as with other ST things (e.g. STRef).
Problem
Is there any way to handle/store this random generator outside of the ST monad safely?
I tried to learn from many packages and code samples that use STRef and similar things, but I couldn't figure it out.
Example
I use the random generator in the simulation like this:
import qualified System.Random.MWC as MWC
import GHC.Prim
import Control.Monad
data World = World { randomGen :: MWC.Gen RealWorld }
initWorld = do gen <- MWC.create
               return $ World gen
something gen = do num <- MWC.uniformR (1,100) gen :: IO Int
                   print num
main = do world <- initWorld
          replicateM_ 100 $ something (randomGen world)
But this code does not work:
import qualified System.Random.MWC as MWC
import Control.Monad
import Control.Monad.Primitive
import Control.Monad.ST
data World s = World { randomGen :: MWC.Gen (PrimState (ST s))}
initWorld :: ST s (World s)
initWorld = do gen <- MWC.create
               return $ World gen
something gen = do
  let num :: Int
      num = runST $ do num <- MWC.uniformR (1,100) gen
                       return num
  print num
main = do let world = runST initWorld
          replicateM_ 100 $ something (randomGen world)
I want to rewrite this code so that it works.
Do I need to define or rewrite the data structure, or do something else?
Is there a smarter way?
Points:
I need to hold on to a random generator (like Gen (PrimState (ST s))) to reproduce results.
So I do not want to create ad-hoc random generators.
I do not want to save/restore the seed. It has too much overhead.
(Saving/restoring the seed takes 12 to 15 times longer than generating one random number.)
That would be slower than using the IO monad, in which case there is no point in using the ST monad.
I do not want to use unsafe* functions.

You shouldn't try to manipulate the generator outside of the ST monad. Because of the type of runST, trying to use things which live "inside" the state thread "outside" of it is nonsensical. Imagine you had a function of the following type (which is the function you are trying to write):
something :: MWC.Gen s -> Int
something gen = runST ...
In order to generate random numbers, some stateful computations must be done with the data inside of the Gen. At which point will those computations be done? How many times will they be done, if at all? Most importantly - how can something be generating random numbers - it is a pure function, after all, so it must return the same value for the same input.
Instead, you should thread the state along, and call runST at the end:
something :: MWC.Gen s -> ST s Int
something = MWC.uniformR (1,100)
main = mapM_ print $ runST $ do
  w0 <- initWorld
  replicateM 100 (something $ randomGen w0)
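The same pattern can be tried without installing mwc-random. The following is only a sketch under stated assumptions: Gen, newGen and uniformR here are hypothetical stand-ins (an STRef holding the seed of a toy linear congruential generator), not the MWC API, and the generator is not good enough for a real simulation. It only shows the shape: the generator never leaves ST, and runST is applied once around the whole computation.

```haskell
import Control.Monad (replicateM)
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)
import Data.Word (Word32)

-- Hypothetical stand-in for MWC.Gen s: a mutable seed owned by state thread s.
newtype Gen s = Gen (STRef s Word32)

-- Create a generator from a seed (plays the role of MWC.create).
newGen :: Word32 -> ST s (Gen s)
newGen seed = Gen <$> newSTRef seed

-- Advance the generator and return a number in [lo, hi]
-- (plays the role of MWC.uniformR). Toy LCG step, for illustration only.
uniformR :: (Int, Int) -> Gen s -> ST s Int
uniformR (lo, hi) (Gen ref) = do
  s0 <- readSTRef ref
  let s1 = 1103515245 * s0 + 12345
  writeSTRef ref $! s1
  pure (lo + fromIntegral (s1 `div` 65536) `mod` (hi - lo + 1))

-- The whole simulation stays inside ST; runST is called once, at the end.
-- The result is a pure list, and the same seed always gives the same list.
simulate :: Word32 -> Int -> [Int]
simulate seed n = runST $ do
  gen <- newGen seed
  replicateM n (uniformR (1, 100) gen)
```

Calling mapM_ print (simulate 42 100) from main would then correspond to the main above. Because the seed is an explicit argument, results are reproducible without saving or restoring generator state between runST calls.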

Related

Create random numbers in a reproducible way and hide generator threading (Using Haskell Monad)

I need to create random data in Haskell.
I want my code to be:
a) reproducible from a seed
b) the threading of generators to be implicit
I understand Monads generally and the way that Random Generators work.
My approach is to thread the generator through the code so I can reproduce the random numbers but want to hide the threading of the generators in a Monad.
I'm thinking that the State Monad is a good approach.
Here's some simple code:
type Gen a = State StdGen a
roll :: Gen Int
roll = state $ randomR (1, 6)
roll2 :: Gen Int
roll2 = (+) <$> roll <*> roll
test :: Int -> IO ()
test seed = do
  let gen = mkStdGen seed
  print (evalState roll gen)
  print (evalState roll gen)
  print (evalState roll2 gen)
  print (evalState roll2 gen)
I'm trying to use State so that I can push the threading of the generator into the State Monad but the results of roll are the same and results of roll2 are the same. I can see that this is because I'm passing gen into the functions multiple times so of course it would produce the same output. So that makes me think I need to get a new generator from each function. But then I'm back to having to thread the generator through the code which is what I'm trying to avoid by using State. I feel like I'm missing a trick!
I explored MonadRandom too and that did push the threading away from my code but I couldn't see how to make that approach be reproducible.
I've hunted a lot and tried many things but seem to always either be able to hide the generators OR make the code reproducible but not both.
I'm keen to use a Monad more specific than IO.
I'm also going to build a series of more complex functions which will generate random lists of numbers so I need to have a simple way to make these random functions rely on each other. I managed that with MonadRandom but again I couldn't see how that could be reproducible.
Any help appreciated.

If you needn't interleave IO with randomness, as here, then the answer is just to lump your State actions together into one with the Monad operations (they're the thing passing the state around for you!).
test :: Int -> IO ()
test seed = do
  print a
  print b
  print c
  print d
  where
    (a,b,c,d) = flip evalState (mkStdGen seed) $ do
      a <- roll
      b <- roll
      c <- roll2
      d <- roll2
      return (a,b,c,d)
If you will need to interleave IO and randomness, then you will want to look into StateT StdGen IO as your monad instead of using State StdGen and IO separately. That might look like this, say:
roll :: MonadState StdGen m => m Int
roll = state (randomR (1,6))
roll2 :: MonadState StdGen m => m Int
roll2 = (+) <$> roll <*> roll
test :: (MonadState StdGen m, MonadIO m) => m ()
test = do
  roll >>= liftIO . print
  roll >>= liftIO . print
  roll2 >>= liftIO . print
  roll2 >>= liftIO . print
(You could then use e.g. evalStateT test (mkStdGen seed) to turn this back into an IO () action, or embed it into a larger computation if there were further random things you needed to generate and do IO about.)
MonadRandom does little more than package up StateT StdGen in a way that lets you still use non-seed state, so I encourage you to reconsider using it. evalRand and evalRandT from Control.Monad.Random.Lazy (or .Strict) should give you the repeatability you need; if they don't, you should open a fresh question with the full details of what you tried and how it went wrong.
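To see that hidden threading and reproducibility go together, here is a minimal sketch that needs only the transformers package. It is an illustration under assumptions, not the code from the question: Seed and nextDie are hypothetical stand-ins for StdGen and randomR (1, 6), built on a toy linear congruential step.

```haskell
import Control.Monad.Trans.State.Strict (State, evalState, state)
import Data.Word (Word32)

-- Hypothetical stand-in for StdGen: the seed of a toy LCG.
type Seed = Word32

-- One generator step: a die face in 1..6 plus the next seed
-- (plays the role of randomR (1, 6)).
nextDie :: Seed -> (Int, Seed)
nextDie s0 = (1 + fromIntegral (s1 `div` 65536) `mod` 6, s1)
  where s1 = 1103515245 * s0 + 12345

roll :: State Seed Int
roll = state nextDie

roll2 :: State Seed Int
roll2 = (+) <$> roll <*> roll

-- All four results come from ONE run of the State action, so the seed is
-- passed from each roll to the next without appearing in the code.
-- The same starting seed always yields the same tuple.
rolls :: Seed -> (Int, Int, Int, Int)
rolls seed = flip evalState seed $ do
  a <- roll
  b <- roll
  c <- roll2
  d <- roll2
  pure (a, b, c, d)
```

Evaluating rolls with the same seed twice gives the same tuple, while the individual rolls inside it each see a different seed, because the seed advances between them.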

Normally, it's pretty much the whole point of a random generator that you don't always get the same result. And that's the reason why you use a state monad: to pass on the updated generator, so that the next random event will actually be different.
If you always want the same value, then there's not really any reason to use special random tooling at all – just generate one value once (or two values), then pass it on wherever needed, like you would pass on another variable.
test :: IO ()
test = do
  [dice0, dice1] <- replicateM 2 $ randomRIO (1,6)
  print dice0
  print dice0
  print $ dice0+dice1
  print $ dice0+dice1

Haskell random numbers

I am trying to generate a sample of random numbers in Haskell
import System.Random
getSample n = take n $ randoms g where
  g = newStdGen
but it seems I am not quite using newStdGen the right way. What am I missing?

First off, you probably don't want to use newStdGen. The biggest problem is that you'll get a different seed every time you run your program, so no results will be reproducible. In my opinion, mkStdGen is a better choice as it requires you to give it a seed. This means you will get the same sequence of (pseudo)random numbers every time. If you want a different sequence, just change the seed.
The second problem with newStdGen is that since it's impure, you'll end up in the IO monad which can be a bit inconvenient.
sample :: Int -> IO [Int]
sample n = do
  gen <- newStdGen
  return $ take n $ randoms gen
You can use do-notation to 'extract' the values and then sum them:
main :: IO ()
main = do
  xs <- sample 10
  let s = sum xs
  print s
Or you could 'fmap' the function over the result (but notice that at some point you will probably need to extract the value):
main :: IO ()
main = do
  s <- fmap sum $ sample 10
  print s
The fmap function is a generalized version of map. Just like map applies a function to the values inside a list, fmap can apply a function to values inside IO.
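As a small illustration of that generalization, the sketch below applies fmap to a list, a Maybe and an IO action. The names incList, incMaybe and sumIO are made up for this example.

```haskell
-- fmap over a list behaves like map.
incList :: [Int]
incList = fmap (+ 1) [1, 2, 3]

-- fmap over Maybe applies the function to the wrapped value, if there is one.
incMaybe :: Maybe Int
incMaybe = fmap (+ 1) (Just 41)

-- fmap over IO applies the function to the result of the action.
sumIO :: IO Int
sumIO = fmap sum (pure [1, 2, 3])
```

Here incList is [2,3,4], incMaybe is Just 42, and running sumIO yields 6.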
Another problem with this sample function is that if we call it again, it starts with a fresh seed instead of continuing the previous (pseudo)random sequence. Again, this makes reproducing results impossible. In order to fix this problem, we need to pass in the seed and return a new seed. Unfortunately, randoms does not return the next seed for us, so we'll have to write this from scratch using random.
sample :: Int -> StdGen -> ([Int],StdGen)
sample n seed1 = case n of
  0 -> ([],seed1)
  k -> let (rs,seed2) = sample (k-1) seed1
           (r, seed3) = random seed2
       in ((r:rs),seed3)
Our main function is now
main :: IO ()
main = do
  let seed1 = mkStdGen 123456
      (xs,seed2) = sample 10 seed1
      s = sum xs
      (ys,seed3) = sample 10 seed2
      t = sum ys
  print s
  print t
I know this seems like an awful lot of work just to use random numbers, but the advantages are worth it. We can generate all of our randomness with a single seed which guarantees that the results can be reproduced.
Of course, this being Haskell, we can take advantage of Monads to get rid of all the manual threading of the seed values. This is a slightly more advanced method, but well worth learning since monads are ubiquitous in Haskell code.
We need these imports:
import System.Random
import Control.Monad
import Control.Applicative
Then we'll create a newtype which represents the action of turning a seed into a value and the next seed.
newtype Rand a = Rand { runRand :: StdGen -> (a,StdGen) }
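We need Functor and Applicative instances because they are superclasses of Monad. We can define them in terms of the Monad instance below, using liftM and ap from Control.Monad.
instance Functor Rand where
  fmap = liftM
instance Applicative Rand where
  pure x = Rand ( \seed -> (x,seed) )
  (<*>) = ap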
And now for the Monad instance. This is where the magic happens. The >>= function (called bind) is the one place where we specify how to thread the seed value through the computation.
instance Monad Rand where
  return x = Rand ( \seed -> (x,seed) )
  ra >>= f = Rand ( \s1 -> let (a,s2) = runRand ra s1
                           in runRand (f a) s2 )
newRand :: Rand Int
newRand = Rand ( \seed -> random seed )
Now our sample function is extremely simple! We can take advantage of replicateM from Control.Monad, which repeats a given action and accumulates the results in a list. All that funny business with the seed values is taken care of behind the scenes.
sample :: Int -> Rand [Int]
sample n = replicateM n newRand
main :: IO ()
main = do
  let seed1 = mkStdGen 124567
      (xs,seed2) = runRand (sample 10) seed1
      s = sum xs
  print s
We can even stay inside the Rand monad if we need to generate random values multiple times.
main :: IO ()
main = do
  let seed1 = mkStdGen 124567
      (xs,seed2) = flip runRand seed1 $ do
        x <- newRand
        bs <- sample 5
        cs <- sample 10
        return $ x : (bs ++ cs)
      s = sum xs
  print s
I hope this helps!

Random numbers and getStdGen

From the Haskell wikibook we have:
import Control.Monad
import Control.Monad.Trans.State
import System.Random
type GeneratorState = State StdGen
rollDie :: GeneratorState Int
rollDie = do
  generator <- get
  let (value, newGenerator) = randomR (1,6) generator
  put newGenerator
  return value
If we execute:
evalState rollDie (mkStdGen 0)
then we get a return type of Int.
This much I understand, but I am wondering if it is possible to wire into this logic the use of the system generator accessed by the function getStdGen. The getStdGen function operates in the IO monad, and my question is (surely this must be the MOST often asked Haskell question) how can you get the generator out of the IO context to use in the non IO monad code above?
Apologies for the newbie question. I am aware that one should not use unsafePerformIO, but otherwise perplexed.

The core of the matter is that it's impossible to produce random numbers out of thin air. You either seed a generator with some input (that's what you do with mkStdGen 0), or you can take the system one with getStdGen.
The problem with mkStdGen 0 is that it's a constant, so your stream of random numbers will show its pseudo-randomness very clearly by being a constant too.
The problem with getStdGen is that it's in the IO monad.
You can't get the generator out of IO to use in a pure computation, that's the whole point of making IO a monad. But within IO, you can bind it using the normal do-notation:
main = do
  gen <- getStdGen
  print $ evalState rollDie gen
Of course, it'll only produce unique results once.
I heartily concur with the comment recommending MonadRandom. I've been using it for my random-based computations since I found out about it, and haven't looked back.

MonadRandom: why stack overflow happens?

This question is certainly for stackoverflow.com
here is the sample
module Main where
import Control.Monad.Random
import Control.Exception
data Tdata = Tdata Int Int Integer String
randomTdata :: (Monad m, RandomGen g) => RandT g m Tdata
randomTdata = do
  a <- getRandom
  b <- getRandom
  c <- getRandom
  return $ Tdata a b c "random"
manyTdata :: IO [Tdata]
manyTdata = do
  g <- newStdGen
  evalRandT (sequence $ repeat randomTdata) g
main = do
  a <- manyTdata
  b <- evaluate $ take 1 a
  return ()
After compilation, running it prints:
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it
How can this happen? Is MonadRandom not lazy, or is it something else? And how do I find the cause of a stack overflow in cases like this?

The issue arises because you are building IO into your manyTdata function.
The monad transformer ends up being of type RandT g IO Tdata. Because each element of
your infinite list can consist of IO actions, the entirety of the infinite list
returned by manyTdata must be evaluated completely before the function can return
any results.
The simplest solution would be to use Rand instead of RandT, as using the transformer
isn't really useful here, anyway; you could also change the base monad to something like
the Identity monad by changing manyTdata to
import Data.Functor.Identity (runIdentity)
manyTdata :: IO [Tdata]
manyTdata = do
  g <- newStdGen
  return $ runIdentity $ evalRandT (sequence $ repeat randomTdata) g
Which will terminate in a finite amount of time. The error concerning your stack size
is simply a result of recursively expanding your list of IO actions. Your code says to sequence all of these actions, so they all have to be performed, it has nothing to do with laziness.
Something else to think about, rather than using randomTdata, consider
making Tdata an instance of the Random class.
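The effect of an Identity base monad can be tried without MonadRandom. The sketch below is an assumption-laden illustration, not the code from the question: it uses the lazy State monad from transformers (which is StateT over Identity) in place of RandT, and a hypothetical step function in place of getRandom. With this base monad, sequencing an infinite list of actions still lets the consumer take a finite prefix.

```haskell
import Control.Monad.Trans.State.Lazy (evalState, state)
import Data.Word (Word32)

-- Hypothetical generator step standing in for getRandom:
-- returns a value and the next seed (toy LCG, for illustration only).
step :: Word32 -> (Word32, Word32)
step s0 = (s1 `div` 65536, s1)
  where s1 = 1103515245 * s0 + 12345

-- An infinite list of generated values. No IO actions are involved, so the
-- lazy State monad can produce elements on demand.
manyValues :: Word32 -> [Word32]
manyValues = evalState (sequence (repeat (state step)))

-- Only the first three elements are ever computed.
firstThree :: Word32 -> [Word32]
firstThree = take 3 . manyValues
```

If the same sequence (repeat ...) were run with IO as the base monad, every action in the infinite list would have to be performed before any result could be returned, which is the situation described above.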

How can I initialize state in a hidden way in Haskell (like the PRNG does)?

I went through some tutorials on the State monad and I think I got the idea.
For example, as in this nice tutorial:
import Data.Word
type LCGState = Word32
lcg :: LCGState -> (Integer, LCGState)
lcg s0 = (output, s1)
  where s1 = 1103515245 * s0 + 12345
        output = fromIntegral s1 * 2^16 `div` 2^32
getRandom :: State LCGState Integer
getRandom = get >>= \s0 -> let (x,s1) = lcg s0
                           in put s1 >> return x
OK, so I can use getRandom:
*Main> runState getRandom 0
(0,12345)
*Main> runState getRandom 0
(0,12345)
*Main> runState getRandom 1
(16838,1103527590)
But I still need to pass the seed to the PRNG every time I call it. I know that the
PRNG available in Haskell implementations does not need that:
Prelude> :module Random
Prelude Random> randomRIO (1,6 :: Int)
(...) -- GHC prints some stuff here
6
Prelude Random> randomRIO (1,6 :: Int)
1
So I probably misunderstood the State monad, because what I could see in most tutorials
doesn't seem to be "persistent" state, but just a convenient way to thread state.
So... How can I have state that is automatically initialized (possibly from some
function that uses time and other not-very-predictable data), like the Random module
does?
Thanks a lot!

randomRIO uses the IO monad. This seems to work nicely in the interpreter because the interpreter also works in the IO monad. That's what you are seeing in your example; you can't actually do that at the top-level in code -- you would have to put it in a do-expression like all monads anyway.
In general code you should avoid the IO monad, because once your code uses the IO monad, it is tied to external state forever -- you can't get out of it (i.e. if you have code that uses the IO monad, any code that calls it also has to use the IO monad; there is no safe way to "get out" of it). So the IO monad should only be used for things like accessing the external environment, things where it is absolutely required.
For things like local self-contained state, you should not use the IO monad. You can use the State monad as you have mentioned, or you can use the ST monad. The ST monad contains a lot of the same features as the IO monad; i.e. there are STRef mutable cells, analogous to IORef. And the nice thing about ST compared to IO is that when you are done, you can call runST on an ST computation to get its result out of the monad, which you can't do with IO.
As for "hiding" the state, that just comes as part of the syntax of do-expressions in Haskell for monads. If you think you need to explicitly pass the state, then you are not using the monad syntax correctly.
Here is code that uses IORef in the IO Monad:
import Data.IORef
foo :: IO Int -- this is stuck in the IO monad forever
foo = do x <- newIORef 1
         modifyIORef x (+ 2)
         readIORef x
-- foo is an IO computation that returns 3
Here is code that uses the ST monad:
import Control.Monad.ST
import Data.STRef
bar :: Int
bar = runST (do x <- newSTRef 1
                modifySTRef x (+ 2)
                readSTRef x)
-- bar == 3
The simplicity of the code is essentially the same; except that in the latter case we can get the value out of the monad, and in the former we can't without putting it inside another IO computation.

secretStateValue :: IORef SomeType
secretStateValue = unsafePerformIO $ newIORef initialState
{-# NOINLINE secretStateValue #-}
Now access your secretStateValue with normal readIORef and writeIORef, in the IO monad.

So I probably misunderstood the State monad, because what I could see in most tutorials doesn't seem to be "persistent" state, but just a convenient way to thread state.
The state monad is precisely about threading state through some scope.
If you want top level state, that's outside the language (and you'll have to use a global mutable variable). Note how this will likely complicate the thread safety of your code -- how is that state initialized? And when? And by which thread?
